ISOLATED GENOMIC POLYNUCLEOTIDE FRAGMENTS FROM CHROMOSOME 7
PRIORITY CLAIM 

This application claims priority under 35 U.S.C. § 119(e) to provisional application serial no.
60/234,422, filed September 21, 2000, and is a continuation of application serial no. 09/957,956, filed
September 21, 2001, the contents of which are incorporated herein by reference.

FIELD OF THE INVENTION 

The invention is directed to isolated genomic polynucleotide fragments that encode human
SNARE YKT6, human glucokinase, human adipocyte enhancer binding protein 1 (AEBP1) and
DNA directed 50kD regulatory subunit (POLD2), vectors and hosts containing these fragments and
fragments hybridizing to noncoding regions as well as antisense oligonucleotides to these fragments.
The invention is further directed to methods of using these fragments to obtain SNARE YKT6, human
glucokinase, AEBP1 protein and POLD2 and to diagnose, treat, prevent and/or ameliorate a
pathological disorder.



BACKGROUND OF THE INVENTION 

Chromosome 7 contains genes encoding, for example, epidermal growth factor receptor,
collagen-1-Alpha-1-chain, SNARE YKT6, human glucokinase, human adipocyte enhancer
binding protein 1 and DNA polymerase delta small subunit (POLD2). SNARE YKT6, human
glucokinase, human adipocyte enhancer binding protein 1 and DNA polymerase delta small subunit
(POLD2) are discussed in further detail below.

SNARE YKT6 

SNARE YKT6, a substrate for prenylation, is essential for vesicle-associated endoplasmic 
reticulum-Golgi transport (McNew, J.A. et al., J. Biol. Chem. 272, 17776-17783, 1997). It has been
found that depletion of this function stops cell growth and manifests a transport block at the 
endoplasmic reticulum level. 

Human Glucokinase

Human glucokinase (ATP:D-hexose 6-phosphotransferase) is thought to play a major
role in glucose sensing in pancreatic islet beta cells (Tanizawa et al., 1992, Mol. Endocrinol. 6:1070-
1081) and in the liver. Glucokinase defects have been observed in patients with noninsulin-dependent
diabetes mellitus (NIDDM). Mutations in the human glucokinase gene are thought to
play a role in the early onset of NIDDM. The gene has been shown by Southern blotting to exist as a
single copy on chromosome 7. It was further found to contain 10 exons, including one exon
expressed in islet beta cells and another expressed in liver.



Human Adipocyte Enhancer Binding Protein 1

The adipocyte enhancer binding protein 1 (AEBP1) is a transcriptional repressor having
carboxypeptidase B-like activity which binds to a regulatory sequence (adipocyte enhancer 1, AE-1)
located in the proximal promoter region of the adipose P2 (aP2) gene, which encodes the adipocyte
fatty acid binding protein (Muise et al., 1999, Biochem. J. 343:341-345). B-like carboxypeptidases
remove C-terminal arginine and lysine residues and participate in the release of active peptides, such as
insulin, alter receptor specificity for polypeptides and terminate polypeptide activity (Skidgel, 1988,
Trends Pharmacol. Sci. 9:299-304). For example, they are thought to be involved in the onset of
obesity (Naggert et al., 1995, Nat. Genet. 10:1335-1342). It has been reported that obese and
hyperglycemic mice homozygous for the fat mutation contain a mutation in the CP-E gene.

Full length cDNA clones encoding AEBP1 have been isolated from human osteoblast and
adipose tissue (Ohno et al., 1996, Biochem. Biophys. Res. Commun. 228:411-414). Two forms have
been found to exist due to alternative splicing. This gene appears to play a significant role in
regulating adipogenesis. In addition to playing a role in obesity, adipogenesis may play a role in
osteopenic disorders. It has been postulated that adipogenesis inhibitors may be used to treat
osteopenic disorders (Nuttal et al., 2000, Bone 27:177-184).


DNA Polymerase Delta Small Subunit (POLD2) 

DNA polymerase delta core is a heterodimeric enzyme with a catalytic subunit of 125 kD and 
a second subunit of 50 kD and is an essential enzyme for DNA replication and DNA repair (Zhang et 
al., 1995, Genomics 29:179-186). cDNAs encoding the small subunit have been cloned and
sequenced. The gene for the small subunit has been localized to human chromosome 7 via PCR
analysis of a panel of human-hamster hybrid cell lines. However, the genomic DNA has not been 
isolated and the exact location on chromosome 7 has not been determined. 

OBJECTS OF THE INVENTION 

Although cDNAs encoding the above-disclosed proteins have been isolated, their location on
chromosome 7 has not been determined. Furthermore, genomic DNA encoding these polypeptides
has not been isolated. Noncoding sequences can play a significant role in regulating the expression
of polypeptides as well as the processing of RNA encoding these polypeptides.

There is clearly a need for obtaining genomic polynucleotide sequences encoding these
polypeptides. Therefore, it is an object of the invention to isolate such genomic polynucleotide
sequences.

SUMMARY OF THE INVENTION 
The invention is directed to an isolated genomic polynucleotide, said polynucleotide
obtainable from human chromosome 7 having a nucleotide sequence at least 95% identical to a
sequence selected from the group consisting of:

(a) a polynucleotide encoding a polypeptide selected from the group consisting of human
SNARE YKT6 depicted in SEQ ID NO:1, human glucokinase depicted in SEQ ID NO:2, human
adipocyte enhancer binding protein 1 (AEBP1) depicted in SEQ ID NO:3 and DNA directed 50kD
regulatory subunit (POLD2) depicted in SEQ ID NO:4;

(b) a polynucleotide selected from the group consisting of SEQ ID NO:5 which encodes
human SNARE YKT6 depicted in SEQ ID NO:1, SEQ ID NO:6 which encodes human
glucokinase depicted in SEQ ID NO:2, SEQ ID NO:8 which encodes human adipocyte enhancer
binding protein 1 depicted in SEQ ID NO:3 and SEQ ID NO:7 which encodes DNA directed 50kD
regulatory subunit (POLD2) depicted in SEQ ID NO:4;

(c) a polynucleotide which is a variant of SEQ ID NOS:5, 6, 7, or 8;

(d) a polynucleotide which is an allelic variant of SEQ ID NOS:5, 6, 7, or 8;

(e) a polynucleotide which encodes a variant of SEQ ID NOS:1, 2, 3, or 4;

(f) a polynucleotide which hybridizes to any one of the polynucleotides specified in (a)-(e);

(g) a polynucleotide that is a reverse complement to the polynucleotides specified in (a)-(f); and

(h) a polynucleotide containing at least 10 transcription factor binding sites selected from the group
consisting of AP1FJ-Q2, AP1-C, AP1-Q2, AP1-Q4, AP4-Q5, AP4-Q6, ARNT-01, CEBP-01,
CETS1P54-01, CREL-01, DELTAEF1-01, FREAC7-01, GATA1-02, GATA1-03, GATA1-04,
GATA1-06, GATA2-02, GATA3-02, GATA-C, GC-01, GFI1-01, HFH2-01, HFH3-01, HFH8-01,
IK2-01, LMO2COM-01, LMO2COM-02, LYF1-01, MAX-01, NKX25-01, NMYC-01, S8-01, SOX5-01,
SP1-Q6, SREBP1-01, SRY-02, STAT-01, TATA-01, TCF11-01, USF-01, USF-C and USF-Q6

as well as nucleic acid constructs, expression vectors and host cells containing these polynucleotide
sequences.
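
As an illustration of two terms used in the list above, "at least 95% identical" in the preamble and "reverse complement" in item (g), the following Python sketch computes both for short DNA strings. It is a minimal sketch under stated assumptions: the function names are hypothetical, the two sequences compared are assumed to be already aligned, of equal length and free of gaps, and the text above does not specify which alignment method or identity calculation is intended.

```python
# Hypothetical helpers for illustration only; not a method taken from the specification.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string containing only A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences.

    Assumption: no gaps and no alignment step; real comparisons of genomic
    sequences would normally align the sequences first.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_identity_threshold(a: str, b: str, threshold: float = 95.0) -> bool:
    """True if the ungapped identity of a and b is at least the threshold (in percent)."""
    return percent_identity(a, b) >= threshold
```

For example, two 20-base strings that differ at a single position are exactly 95% identical under this ungapped definition, so `meets_identity_threshold` would return `True` for them.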

The polynucleotides of the present invention may be used for the manufacture of a gene 
therapy for the prevention, treatment or amelioration of a medical condition by adding an amount of 
a composition comprising said polynucleotide effective to prevent, treat or ameliorate said medical
condition. 

The invention is further directed to obtaining these polypeptides by 
(a) culturing host cells comprising these sequences under conditions that provide for the 
expression of said polypeptide and 
(b) recovering said expressed polypeptide.

The polypeptides obtained may be used to produce antibodies by 

(a) optionally conjugating said polypeptide to a carrier protein; 

(b) immunizing a host animal with said polypeptide or peptide-carrier protein conjugate of
step (a) with an adjuvant and

(c) obtaining antibody from said immunized host animal.

The invention is further directed to polynucleotides that hybridize to noncoding regions of
said polynucleotide sequences, to antisense oligonucleotides to these polynucleotides and to
antisense mimetics. The antisense oligonucleotides or mimetics may be used for the manufacture of a
medicament for prevention, treatment or amelioration of a medical condition. The invention is
further directed to kits comprising these polynucleotides and kits comprising these antisense
oligonucleotides or mimetics.

In a specific embodiment, the noncoding regions are transcription regulatory regions. The
transcription regulatory regions may be used to produce a heterologous polypeptide by expressing in a
host cell said transcription regulatory region operably linked to a polynucleotide encoding the
heterologous polypeptide and recovering the expressed heterologous polypeptide.

The polynucleotides of the present invention may be used to diagnose a pathological
condition in a subject by

(a) determining the presence or absence of a mutation in the polynucleotides of the present
invention and

(b) diagnosing a pathological condition or a susceptibility to a pathological condition based
on the presence or absence of said mutation.
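
Step (a) above, determining the presence or absence of a mutation, can be pictured as comparing a subject's sequence with a reference sequence position by position. The Python sketch below is only an illustration under stated assumptions: the function name is hypothetical, only single-base substitutions are detected, the two sequences are assumed to be aligned and of equal length, and nothing here reflects a specific assay described in the specification.

```python
from typing import List, Tuple

# (1-based position, reference base, sample base)
Substitution = Tuple[int, str, str]


def find_substitutions(reference: str, sample: str) -> List[Substitution]:
    """List single-base differences between a reference and a sample sequence.

    Assumption: both sequences are already aligned and have equal length, so
    insertions and deletions are outside the scope of this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("reference and sample must have equal length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(
            zip(reference.upper(), sample.upper()), start=1
        )
        if ref_base != sample_base
    ]


def has_mutation(reference: str, sample: str) -> bool:
    """True if at least one substitution is present relative to the reference."""
    return bool(find_substitutions(reference, sample))
```

Whether a detected difference is associated with a pathological condition is a separate question that this sketch does not address.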

DETAILED DESCRIPTION OF THE INVENTION 

The invention is directed to isolated genomic polynucleotide fragments that encode human
SNARE YKT6, human glucokinase, human adipocyte enhancer binding protein 1 and DNA
directed 50kD regulatory subunit (POLD2), which in a specific embodiment are the SNARE YKT6,
human glucokinase, human adipocyte enhancer binding protein 1 and DNA directed 50kD
regulatory subunit (POLD2) genes, as well as vectors and hosts containing these fragments and
polynucleotide fragments hybridizing to noncoding regions, as well as antisense oligonucleotides to
these fragments.

As defined herein, a "gene" is the segment of DNA involved in producing a polypeptide 
chain; it includes regions preceding and following the coding region, as well as intervening sequences 
(introns) between individual coding segments (exons). 
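
The relationship between exons, introns and the coding sequence in this definition can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the function names and the interval convention (0-based, half-open coordinates, sorted and non-overlapping) are hypothetical, and the example coordinates are invented rather than taken from any sequence in this specification.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based, half-open [start, end)


def splice_exons(genomic: str, exons: List[Interval]) -> str:
    """Join the exon segments of a genomic sequence in order, dropping the introns."""
    return "".join(genomic[start:end] for start, end in exons)


def intron_intervals(exons: List[Interval]) -> List[Interval]:
    """Return the intervening sequences lying between consecutive exons.

    Assumption: exons are sorted by start position and do not overlap.
    """
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]
```

With a toy genomic string `"AAACCCGGGTTT"` and exons `[(0, 3), (6, 9)]`, the spliced product is `"AAAGGG"` and the single intron is the interval `(3, 6)`.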
As defined herein, "isolated" refers to material removed from its original environment and is
thus altered "by the hand of man" from its natural state. An isolated polynucleotide can be part of a
vector, a composition of matter or could be contained within a cell as long as the cell is not the 
original environment of the polynucleotide. 
The polynucleotides of the present invention may be in the form of RNA or in the form of
DNA, which DNA includes genomic DNA and synthetic DNA. The DNA may be double-stranded or
single-stranded and if single-stranded may be the coding strand or non-coding strand. The human
SNARE YKT6 polypeptide has the amino acid sequence depicted in SEQ ID NO:1:

KLYSLSVLYKGEAKVVLLKAAYDVSSFSFFQRSSVQEFMTFTSQLIVERSSKGTRASVKEQDYLCH
VYVRNDSLAGVVIADNEYPSRVAFTLLEKVLDEFSKQVDRIDWPVGSPATIHYPALDGHLSRYQNP
READPMTKVQAELDETKIILHNTMESLLERGEKLDDLVSKSEVLGTQSKAFYKTARKQNSCCAIM
and is encoded by the genomic DNA sequence shown in SEQ ID NO:5:

CCAGACATAGGCAAGGCGCAAGG'rGATACAGTAGGCAGCCACCATGGGGGCCAGGAGGCTCC 

AGCAGAGGCCACACAACCAGCCCAGAATCCAGGACAGAGAGCTGGAATGGAGACAGGGAAG 

CCAGATACCAGGCCAGACTGGCCAGGTGCTACAGGCCTGTGGGCCAGGCCAGGCTTGGGGAC 

15 TTCGTCCTGGGTGTG A AGG AGAC AGGC ACCCCTGAGGCCTTCCCTCTGC ATCTCC AGCCC A AG 
CTAAGCGCAAACTCT1 AGGTTGGAGTAAGGAGTAACCCCCTGCCAAGT T TC T C C TGTCCTCAG 
GCTCCACCCACCACCTATGCTGCCTGGCCCCATGGGGCACACGCTCAGGCCCAGCCTGGGAAA 
GCAACTGCACCTGCCTGTGCTATGCTGGCCCTTCTCAGCCTCAATGCCCTCCTCCCTCCCCGACG 
CAGGCT^GTOGGGCXGGCT^GGeC^^ 

20 GTGTGGCCCTGCCCTTGGC T CC C CTCCACACCTGTGTCCCAGGCAGTGCCACGGCACTTTCCTA 
AACAGAAGGATGGGCTTCAAAACAGTCCCAGACACTAAACACACCTGCATTTTGGGTCCAAG 
TAACTTCTGACAAGACGAGTGCCCCTACACAC TCT CAGTCCTATCCACTATGGGCAAGGAGCC 
TGAAGGATCCCCCAGAACTGGCTAAAGCCCTCAGTCTCCTCCTCCACCCTGAGCACCTTCACGC 
GGCAGAGTGGCCCTGGATGTCAGCTTCITGCTCCCCATGGTCTGCACCTGGACAGGTGCTCTC 

25 AGGTGTGTGGGTGGGCAGGTGGCAGGTCCCAAGAGCCAGGTGCAAAGAATCTAGGCCAGTGC 
C CACGAGTGC T-G CAGTGTCTGTCCCCAGCATGGTATCTAGGGCTCCACTTGCCTATCAGCTGTA 
ATCGGAGGAGGCTTTCCAGGCCAGGCCTCCCCCAGGAAGGCTGCAGGCACTGCGGATCGTGCG 
CCCTCACATGCATT ATTCCTG A GGCCCTTC T GCA GAT GCCATCA G GGCAGCAACTCTGATGAG 
GTATTAGGGCACAGCACACAGGGCTAAGCCACCCTGTACTGGGCCAAGCGCTACAGGCAAAA 

30 AGGACACCACCGACGGGCATTTCATTCATCGCTTTTATTTTTATATATTTTTGAGAGGGAGCC 

GGGTTCAAGTGATTCTCCTGCCTCAGCCTCCCGAGTAGCTGAGATTACAGGTGCCCGCCACCA 
TOCCCAGCTAACTTTTGTATTTTAGTAGACATGGGGTTTCACCATGTTGGTCAGGCTGGTCTC 
GAACTCCCGACCTCAAATGATCTGCCTACCTCAGCCTCCCAAAGTGCTGGGATTACAGGCATG 
35 AGCC ACTGC ACCCGGCCC ATTC ATC ACTTTT A A AT AGC ACCCTCTG A AC A A AGCTCCCTGGGCC 
AC ATGAC C C T AAG GG' I TACCCCAT C CC AC C C C AACCCAGGTCTGGCAGGTC CT CAGAA C AG G A 
AAAGCTGAGCACTGCCCAAGGCTGCTTGCTGGGCCAGTCAGAGAGGTCTCTGCCTTCCAGGAT
CAGAAGTACAGGCTGAAAGCA GC C TT GGGC C CGCC T CCCTGGGAGGCTACAGAGG C TTCAGA 
GGGTTCCCTGAACTCAAAACCAGATGTGAGACTTGAATTTGACTTACCCCTGGTTCACCTCCC 
A ACC A A AGC AGGGGTC AGCTT TGG CTCCTCC AGG A ACC AGG AAGCTTCC AGGTACCCTGTGG A 
5 GCCCCCTCTGCTCCTGAAAAGTTGCCACCTGTGCTTGGTCK3GATGCCAGGTGGTCTCAGATTG 
ACCCTGGGGTCAGCGGTGAGGGACAGGAAGCCTACAGCGGGATCAGGATGGGGATGGGGCCT 
C CTGTCCCATCGCTCTGCAGCTATGAGGCAGC'nTCCTAGGGTGGGTCTCCTGGCTGCAGCTA 
AGACCAGGCAACAGGATTCAGCAATGACAGGGCTTCTTCTACTCCAGGGCTCCCTCACCTGGT 
mA€A GCAAAAAAGAAAATACAG r ITCCTGCTACiCAAGGTCTATAGAAAGGAGGTGAAGGAG 

10 TCAGGCCTGCAGCTACCTCTCCTGGACAGGAGCTGGTCAGGATAACTTGGACCCTTGCATGCG 
GCAGGCCCACAGGCACACAGCATGAGGCCACTCTCTCCCCCGGGGGAAGGGCTTGGTGAAGA 
A AG GAl TC C C C TGAAG CA CAAAGAAAGCACAGGACCACTG'rGAAAlTTCAAGAC AACTTr A T 
CCAGACAGGCGCCTCTCAAATAGAACACAGGGAAGTTAGGCAGCAGTTACTAAAATACAGTC 
TC GCC AAA TGAT TTACAACAGAACACAACAGGAGCAGGGGATCTGTGGGTGGGGCTGGGCTG 

15 GGCCCTCTATCTCACAGGGCCTGAGTCAAGCCAGCCCGCCCTGCAAGGCAGGGGCTGACCTGC 
A AG C GGAGATCT CACTTCCTCTTACCC C AAA TT CATAC CT CC ATTTTCCC CG CCCCCATCTCTCC 
CCAGGGTCCTCAAGTGGGAAAGGGAGAGGTAGCATCCCTCGGATCCAGGCCCACTCCACTCCG 
TCTCCGGCACCAGTGGGCAGGCTGAGTCTGGGCCTCAAGGGGCCCTGGGCTrAGGGTATCTAT 
GC^AG-T^CfGAAAA-EGACJATOG^ 

20 TACCC AG AGA A A ATGGGC AGC AGC AGGTA A ACCAGCC AGG AGGTGGAGTCCTCTG A ACCC AC 
Ag CAGACCCCACCCTCCTGCCCAGCCCCTGCCCACATTGGGGGTCAGGACCACTGAGACTCTG 
GTCAGGACAGTGGGTGCTCTCAGCAGTGTGGCAAGCTCAGAGCAGAGCTCCCAAGGACCATA 
CCACACTGGTTCAAAACCCATAGGTGACACCATCCCAGCAGAAGCTTCCATGGGTGCTGGATC 
CCAGGGCTGCATCCTGAGCACAGGTGGGCAGACTGGAACATAACACTAGGACCCAAGGGATC 

25 CAGAACAT TTTAGGCC CATCTCCTGGGCTGCTCCAGCCTGTTGCCATGACTTGGGCAGTGAGT 
GGGCCTCCTG C CAGGTGGCAGG G C AC A GCTT AGAC C AAACCCTTGGCCTCCCCCCTCTGCAGCT 
ACCTCTGACCAAGAAGGAACTAGCAAGCCTATGCTGGCAAGACCATAGGTGGGGTGCTGGGA 
A TCC TCGGGGCCGGCTGGCACCCACTCCTGGTGCTCAAGGGAGAGACCCACTTG TTC AGATGC 
ArAGGCCTCAGGCGGTrCAAGGCAGTCTrAGAGCCACAGAGTCAAATAAAAATCAATriTGA 

30 GAGACCACAGCACCTGCTGCTTTGATCGTGATGTTCAAGGCAAGTTGCAAGTCAAGGCAAGT 
G T CCCAGAGGC C C T CK jG CAGCTGAG T GCACCTGTGTTTGATC TTC CCCTGATGA TGG ACA C TC 
CCAGCTGACCATCCAAACACCAGGAAAACATCCCCCTTTCCTGGGCTCAGTTCCTAGTCTACTT 
GC T GG TA CG AACCCAACCCACACACTCCCCGCCCACAATGCAGCTCCTTCCAAATCCTCCCACA 
AGCCA (XT4 TOTGGGACTTGGAAGCTGCTTAGGATGGGCCCTGCCCTCTGCGGGAAGCCAATC 

35 CTAGCAGAAAGGTAAGCTAAACAACAGTCTCAGAATCTGAGACCCAGTGACT 
GTTCCCCCCGCCCCAGGCCTTGGGCXTGAAGTGGGGGCCTGCCTGTGGCCTCTGTGGTGGGCTC
AC TCCCACCC CC AACAGTGGCCCCAGGAGAGGCTTTCCCAAGAGTCTTCAAACTCCACCCACCG 
CAGCCCTAGCATCAGGGACTCCCCACCCCCCACTGGAGTGTTAATATCATTAATGTACAAATA 
AG ATCCAAAGATATACCAAA.GATCGAGAAACAGCTGGCTCCGACCTCCCTCCCACAGAGCCTT 

ATT TGCCCAAAA ATACACCTTTGGGGACCTCAAA TT CTTTCCAAGAATCACTACC AC ACATAT 

C CAGTCTGCCCCC A C TCT GG CTCCT C GTC T AT GG CTG T CTCTTCTTTTCCAG GGGCTGCAGTTCT 
GA-FCTGAAT-G ATGGTGCC ATTCC A - GC ATTGGGCCTCTGGC AGGCTGC ATC AC ATG ATGGC AC A - 

10 GCATGAGTTTTGTTTCCGGGCCTTGGAAAAAAACAAAGAGGAGCTGAGAAGGAGGACTGACG 
AAGTAAGGGAAGCCCCAATCCTGGCAGGCGTGGCAGAGGGAGCTCCACAGGACACAGCCAGG 
CAGAGAAACTAGCACTAGAACAGGGTGGGGGTGGAGGCC'ITGAGGGAAGCTGTCCACAAGC 
AATTCCCATCACCAAGCACAAGGCGGGCCCCGGCTTCCAAAACTAGTCTGGGATCCTTTTTCCT - 
TTCTTTTCTCACACCCC AT TAA T GCTATCA4AAAGTGAGTAAAATTCCTACAGTTAGGCCAGG 

15 TACAAACAA AGGACCAAT A AT A CA AATGGGATTGGCAGAATATCTTAACTTTGCCCCACTCCT 
GTCTTCAeAeAAJ^TATCTGACCACCAC 

CAACAGATGTGGTTCCCACTTGGGATGTCK3TTTGTGGGGACCACTGTTGCCACCTrCTCTCTrG 

CTTTCTGGTCACAGACTATCTTCCTAATCCCACCTAGCCATCTCCCTCCAATGTGCACATGAAA 

GCAAATGTG TGT GG ACAG ACCAAGTAAATTTGTCCCTATGA C TA TC CAACCATGG G CC A ACAG 

20 TGCCATCTCCACATAGGAAGACATGAGCACTGACCTGAGAGAAAGCGGCAGTCAGCAGCACC 
CATCCTTGTCAATTAAATATT1TCTGTCAAAGGGAAATTAAAAGCTTAAGAACCTCTTCAGGA 
AGGCTGAATT - GCTTGCATCTTAAAGACTTATGTCTACTCAGCAGAAAGAGGAATAAGATTCA 
ACAGTAAATCTC TGGTGATCAGA AC TTGAACCAGC CTTCCTGGACTGGGAG T AG G A G TTC A G 
AAATCAGCCAGAGCA G CAGAGGGCAGAGC A G AG GC A GGAGTGG AACAAGGCCTCGGCCCGC 

25 ATCGACTCCAACGGCGCCCAAGTGAACTGCCTCCAACCACCTG GGCCTGA GG C GCTCAC CTTA 
GGCTCTTGCCGC AC AA G GAA TCATCCACCATGATTCAACAGTCTAAGAAAGACCCGTTCATAG 
TGGAGAGTGCCAGAAGCAGCAAGCTGCGACTGCTCTCTAGAGAGAACACCCAGGAGGCAGCA 
GGTGCTGGGI^^CTCACAGTTTTATAGAAGGCTTTAGACTGTGTTCCCAGCACCTCGGATTrGG 
ACACCAAGTCATCTACXriTCTCACCTCGCI€TAACAGAGACTCCATGGTGTTGTGCTGGACAA 

30 AAAAGAAAAGAGAATCCAGCTCTGTTCAGTACGTGCCCTGACATGAGCCCCTCATATTTCAGT 
CATGGGGGAA AGTnCCTrTACCTGGGTTCCTCTCC A AC AC AC AC A A ACTTC ACCTCTAGGTGTC 

gagactcggtccaa gaa tagt tactgtccaagtggatggaacagaacctggtgacattcccg 
tgaaajctagaa43a^taactgggatgtack:agact^ 

c ttagataaccagcactccaggaaaactcatatatatatatacacacacatttatatatacat 
35 ttgtgtgtgtgtgtgtgtgtgcacgcacatgtgcgtgtgcatggagctttggaaaaaagagt 
agcrgggcactatatgattgtactgggttggagagtgacccacaccgcaccccccaaccccaa 
CC GCATCC C AGAAATTAACATCCCCAGAATC TCTGA ATGTGACCATATTTAGAAA T AG GGTCT
TGATGTCCTGGTAAGAAACGGAAACACA CAC AC AG AAGGTCACGTGACGGCAGAGGCAGAGC 

5 AGAGAG GCCTGGG A C A G A C A CTCA GAGCCCCA AAAG ACA C CA GCCAGG C C CACAGAGCTAT C 
TGTTAAAAGCAAATATTTGAGGGTTTCTGTTGACAGCAGCCACAGGAAACAAAAGGCGGTGG 
GAAATGGCTATTGAGCACTTGATGTGAGGCAAGTCCAAACTGAGCAGCGCTCTGAGTACAGA 
CACACCAGATTTCAGATGCAAACTCACACATGCTTCATTAGTAAGTTTTATACTGAAAAAAAA 
A CAAGT TT TATACC G ATTACATGTTGGAAAAATTGTATTTGGATATACTGCGT T AA G TAA a ^,^ 

10 ATATAATTAAATTAAATTCTACCTATTTTCCTTTTATCATTTTAAAATATGGCTCCTAGAAAAT 
TCTAAGTTACACACATGCCCCAAATATATACCAGACAGCACTATGACAGAACATGTCCTGCCT 
TOAAATGGGCTATGTCCTAAATGTCATCACTACAAACTCTGACTTAGGAAATGAAAACACTG 
ACCCCATGGGAAGGGGTCTAGAGATGGAGACCTCACAAGAGCCAGCAGCTCTGCTGCCAGGG 
CCCTCAGGAAgCAGCAGCTCGCTrCTCTCCTCAGATGGCCACTGCTGCAGCAGCTAGATGC AC 

15 AC ATG A AGCGCC AT AG A AC A AGG AGCC AGC A AGA ATGTCCTTC ATCCCT AC AC AC AGCTG AG 
CGA CTCAAATTTTTAACACAGA AAGTT AA CT GAT T CAGA T A TGC A C A C C A A T CA T CTAGA T TT 
TACAACTGCAG CT A GAT GAGGCTG GGT G AAT AGG ACT CATCCACTCCC C A CC GTGGGGAGA G- 
GAGAAACAGCGGGTGTCCCAGGTG T CATGGTACTCAGACTAGGACTTGAGCAACAGAAAGAG 
ATGGCITGAGGAGA AAACGG AGAAATGCCACCTAGGTGGTAAGAAAGCTCACAAGGTTTCA A 

20 A AGACAC AGATACCATGAGACTTTC AC AT C TATCGTTCATTC C AAAGC CACGTTATTTGGAGT 
GCAGTCAGC A CACCTGTGTTTGAAGCCCCTGGGATGCTTTT T A T AAAATGCAGGTTCCCAGGC 
TCCA TC G C AGGCCAACAACTCCAACCCCAGGAGACG C TGATGTACACA CT AAAGC TATGCCTG 
TGTAAATGGTAAAGC T TTGTATGTGGGTTTCAATCCACTCCAGGTATCTATCAACTGCTGAGC 
ATGGTATAAACTAGGCACTGTATCATGAGCAGGATGGAAAGATGTCCCAGTGCTCATACGCT 

25 GGTCAGGGAGACATGTAAACAAGCAGTGACAAAACTGTGACATCTGGTCAGAAAGGCCCAAC 
C3TC AGGCGC CTGTCmTGAGCT G GGC AAGAAAGGGTATAA 

GAGACTGTGAGTTAGTTTGCACTTrATCCTGGGGCGGATCTGAGAGCTGCTGAAGGGTTCTAA 
GCTGTGCAGAJCAATGACTACTCTCTGGTGGACAGACTGGAGGTGAGCAGGAGGCAAGGGGA 
CCAC F rAGAG GCAAAG GC TGTAAGA G AAAAA C CTGAGAAAAACAGATAGCTGCITACATrCC 
30 ACTTGTATGCAAAAATTTAAAAAAAAAGAGTTGAAGCAACAGTTACAAATCAGGAGATTTCA 
GC rC AAAATG C AGGGTTCTGGCTCTTTTO 

CCAGAAGCTGCCCTGTGGTCAGTGCACGGTGCTTCAATCTGTTCACCTTCAATGCAAACGCTG 
CAAGGGGA GG CACCTGTGGGGTGTGGAG G CA CCC GAAACCCTAACAAAGGCACCAGGGTGGG 
AATCCAGGTCTTCAGAAGCCAAACCCTAGGAACCCAGTAAATGGTCAGACAGGCAGTAGCCA 
35 TGAGGAAGGGAGACTTGAGGGTTCCACTGGTTCCCAGC TTGGT CCCCTAGAAACAATGGGTG 
CCATTAACCAAGAGAAGGGTATAGGAAAGACAGTCTGATGCCCGGGGTGGGGGAAGGGGT-G 
ggcaatcccacttgctggagagtgccgtggttactattatattaaaacgaggatggatctgtg
c atgcctggccagtggaaatcgcacccccgcctcagttcttgggcttgctctccatcttcctgc 
ttaccagaatgattttggtctcatctagttcggcctgcactttagtcatgggatcagcttctcg 
tgggttctaggaaagagtgaaaaataataaagtcaggactggagtggctacctgcaaacaa 

5 aacctaaaactgaggaagctggacaaactttcacaggttaaaaaccacagcctgggccgggc 
acagtggctcacgcctgtaatcccagcattttgggaggatgaggcgggtggatcaccagaga 
tcaagag'it -c gagaccagcctgaccaacatggtgaaaccgtctctactaaaaatacaaaaat 
tagccaggcgtggtggcacatgcctgtaatcccagctactcgggaggctgaggcaggagaat 
cgg3h gaacc c aggaggtgcaggttgagtgagccgagattgcgc c ac tg cac tc ca g c ct gg g 

10 aaacagagtgagactccaactcaaaaaacaaaaaacaaaaaaaaaaacccacagcctgttta 
acatgtaacagaaacccaaagcctgcctagagcttgggttccccggtctgaacgtagattctc 
tottttccaaacagtaaggcrtgagagaggacaccagcatcagaagctgtcagaagtaatta 
gaccagaactatcagggcagttggctttttcagtttcacatggattctgggccacatggtgtc 
tgctgaagcttcctttaaccctagctggtatctactgaggtgaccatccagggctgggtaatg 

15 gattgtagcaggggatcctactggccagtctatcctgtcgacttgcttggagaattcatctag 
ta c ct g c aa gacaaaggagac tca a caa gcctcccactgtgcactcaccy\gtggt ctcaatga 
cagggcrj'c a cc cctgagcac ctc a ccc tgaatgaggctcc'ltggccrrcacagcccaggaag 
ga ggaatga gggggacatataatggcaacagagaaaatctaggctaaagttctttccaaatt 
ma tcattaaaacatatcctaaatattctgagaatcaaaagtatgcccagcccgag a tgaac 

20 c tcac t tg gggagtaataaaggtatttgaattttaaactacagatttccagaaaaaaggggc 
actggtcctctaa ttttcc aaagcaatttt t ta a aaa a gagaattaggtcccctaga t tta 
aaaccacc a gattccatgtgtttggaggta ttttg gtgctctggggtataggatgaagcctct 
gacttcaaaciagttaatattagtaattagc ac c g tacgcaaaaaaatttaaagaatgcttag 
gtociaai3clgtotogtck:aactgactga€atcaagg 

25 gcaatggggagagtgaaggcattcaagagggagactccttgagcagaagcacagggggcga 
gaacacaaggcacagctgtctc c gag g gtcccatcc ca ga g aatagatgctatgactcagtg 
gcctagacccagctcacatgagggacagcaccggggaggaaacccatacagggatgccaaat 
totctcttgggttgcagggaagggggctgaaaaatgtgttgactttggacacatcatttcatc 
cc'itatgtctcagggactgccatcaacccctgtcccagtccataaatgtgcccatrcatcatcc 

30 aagtccaggagaggcaaataaaaaactcaccttctccagcaaggtaaaggccacccgggatg 
ggta ttc a tt gtc ack:aatg accacacctgcaag a ctatcattccggacgtaga cgt gg c aca 
gatagtctaaggagacaagagatcagacacatggatgctgacatgagggc1tcagacttctt 
ttaatccccccaaatcaaagcatccaatgttaggccaaatgaagccactcggaagctcaatag 

35 ggcccaacgcctgctttaggaccagtaaatacccagaggcccagtatgcaaagccagggctta 
aagaaacagccagtck3tgcagaaaacacacccttgacaacatggccccaggagcatttccaa 
GTGTATTCCTTAAGCTCGGGTCAGGCCAAGCTATATCTTAGGGATCTGGAGCCCTTGGGGCTC
TGTGCTGCTCCCAAACTTAGGGAACCCTGGACAAGCCAAGAGGCCTCTGCTTTCTTAAAAAA X 
CTTTTCAGAGCAGCCAAAAGACAGGAAATTACCCCCCAGGGCCTCAGTCTTCCATATTATAGC 



5 G A'r "J-AGCT A G -rGCGAGC ' rA AQGCCTCC AGA GC A CTG3'' G C TA A A A T CCTCATATGATTGAAAG G 

TACAGTTGTACAG GTCTTGC GCAAAATATTCACAATCCACAGGATTGTTCA TT TC C AT C AC TT T 
GAAAGGATTCAGAGTTGATACAGCTAACCATATCCCCAAGGAAAGAGAAATGTAAGGATTAC 
AGCT T AC A AA T AAGAACCTTCTTGTCCTTAAGGATCTGACCCAGAAGATTCCAATGCTAAACA 
ACAGAAAAACAAA T AAAAGAGGAGGGAATGATGGTGAGCCCCTG A AATC A G AAA A G AG C A G- 

10 AGATAAATGAGAACAAGAATGAGGAGGAGGAAGAGGACAGGGGGTTGTCACCAATGCTCTC 
CAGATTTTGTATACCATCCCCAATTAAGATTCAAACATGGGGTCAAAGTGCATACCCTCCAAA 
GAAACTGAGAACCTGGTCAGTGGAGGAATTGTC1TTAAGTAATAAACGTGGGAAGGGCAGGG 
ACAGTTTGAAGAACAGAGCAAGAACACTGAAATATTTGTGATGCGATTTCACTTCTATGATG 
TTAATAGCACAGAGATCCCACATAAAGTGTATATAGTCAATCCTGCCTGTATCATAACTGACA 

15 TTTATATCATCAATTCAGTAACTCTATGTCACGTGACTTGAGGTTAGCATAAGTGTGAGATGA 
T CTTTGTCCCTACCTGATGAAACTCATGTAACTCTTTCCTGATCTGTCTGTATAACATACACAT 
CTAAATAAATGCCTAAACCTGAATTATCAGAAAGAAAAAATAGTTTTTTCAGATTCCTGATCA 
AAAAATCTAC'GATGCACAGAATACATATAGTACCTCAACAGTGCTAGCTGGAAATCCTTTTTT 
GAGGGGTCTGCAA CT C TG AAGAGGA T AGGGAAGAAT A CG A TATGAAGGCTGCTTACTGCTCC 

20 AAAAG AGTC A G AC C CT AATCTTA A A T G AGTC T A AGTTTG AGGGC AATTTTATCTGGG A AGCT 
CAGACTTCAACAGTGGGCACAGA.A.TTCTGCATAAATAGGAAAAGGAAGAGGTGGGAAAGAG 
AGAACAAGCTAGAGGAGGAGTAGGGTCCCAGTAGAAAGGAGAAAGCTGGGTGCTATGTGAG 
GTGAGGCATGGCAGCCAGGCCAGCACACGCACAGAAGTTGGAGGGTCTTCTTACCTTGTTCTT 
TGACAGAAGCTCTAGTGCCTTTCGATGAGCGCTCCACAATCAGTI'GACTCGTGAAGGTCATGA 

25 ATTCCTGAACGCTAAGAAACACAAAATGTATTTATTGCCTACTTCTTATCACCTTGTCCCCAAC 
ACAGTGGAAAGTGACCTCTGGGCTTATACATTAAGTAGACATTGCTTCTTGGTTTCATTCCTT 
TCCCTCCCATCCCTAGTAACAAACACTCTATAAATGAGCACAAATACTGATAATTATGAATTA 
TCATCACCATGAAAGCTCCATCTGTTTGCTACCTGGCTCACCAAAACAGGTGAATTTTCTGGG 
G GGTnTTCCACAGGATACAGTCAATTlTACATTITGGTGAATGCATAATITGGAAT 

30 GAAAAACAAGAGGCAGGTCCTGCTCTCAAGGTCCCAATAACTTCCAAGAAGCAGGACATTTA 
TAAGAAC T GCACTAGAAGAATA G TG TG CAAA A A C TGTCAGGCAGAAATGCAC A ACCATTTAT 
GGCTGTGTCCACATGACAGACCCTCGCAATGCCACATACACCCATAGTGAGTGCTGGCTCAGG 
TCK?CTO GGGC T CG T CCAC A GAACGA GCGCAAG A C ACTCTGGATG G AA CAAA AGGAAAACTC 
C£CA^CX:AA GA C AAAG A A GTGGGA A ATGG C TCATACA AA G GG T GA AAGG GAGAAGGTCCAT 

35 CATGGGCTCAACAGAGAGATCTATCCAGAACAGAACAGTCACAGGAGATGGTACAGCCAGAG 
GAAGAGGTGCTGACAAGGAGCCTCCAACTGAGGATGTGATATAAAGGGCAACCAGGGCCATC 



AACCfGCTGGGTTTGC: 
AAAGCAGGGTGCTCAAATGGGAGTCTGCAGCAGGCTCCAGCAGAGCCATATAGGTAACTGAA

TAAACGAGGTAATACGTGTAAAGCACTTGGAACAATGCCTGCACACACAGTATTACTTGTTA 
ATATCTTGAGGGACTGAAGTGATCAAAATAACCCCTCAGAAAAGAAGACCTCAAACAAGGAA 
5 GGCTTrGCAGTAAACCTAGAGACAGCAriTGAGACACGGCTATAAAGAGACAAAGGAAGAAC 
TGCATTGTGACAGCATGTATACAAAGACCAAAAAAGCTGGGAAACTACTTTTTCAACTTTGG 
AATCGGGTAATTATAGGGCACAAAGGACGTAAGTAAAGCGGTCTTATAAGAAAACAAGCTCA 
GGCCGGACGTGGTGGCTCAAGCCTGTAATCCTAGCACTrTGGGAGGCCAAGGCAGGCGGATC 
ACTE GAGCTCAGGAGTTCGAGACCAGCCTGGCTAACATGGTAAAACCCCATCTC T ACTAAAAA 

10 TACAAAAATTAGCCGGGTGTGGTGGTGCGCGCCTGTAATCCCAGCTACTTGGGAGGCTGAGG 
CAGGAGAATCACTTGAACCCAGGAGGCGGAGGTTGCAGTGAGCTGACACTGTGCCACTGCAC 
TCCAGCCTGGGTGACAGAGCAAGACTCCATCTAAAATAAAATAAATAAATAAATAAATCAGC 
TGGGACATGTGTTGTTTTAAGACATATTAGTAGA\GATGTCCCTTTAGTGTTGCAGCTGTTAGT 
CATTGGAAACTAGTGTGGGCATCCCAAGCAGGTGAGGTATAAGTCCTACAAGTGAAATCTCT 

15 GAGAATCTTAAGTACTAATGGGAAGGAAAAAGGAAAAAGAATCAGAGCCAAG1TGGCACCA 
AAAGTTCCATCTGAGAAAAGCAACAACACAGAGCAGTGAATGTAGGCCATGGTAAAGACTGC - 
AAAGACCAAGAACCCCAAGAAGGAGCTAAAAGATAATGCAGCAA1TCCGC1TCTGGGTAAAT 
ACCAAAAAAATGCGAGCAGGGTCTTGAAGAGATATTTGTACATCCATGTTCATAGCAGTATC 
ATrCACAATGGCTGAAATC^^^ 

20 AAATGTGGTGAATAATACAATGGATTATTCAGCCTTAAAAAGGAAAGAAATTCTGATATATG 
CAACAAGATGCATGAGCCTTGAGGACATTATGCTACATGAAATAAGCCAGACACACAAAAAC 
T AT ATG ATTCC ATTTATCTA AGGTCGCC AG AAA AGTC A A AATC AC AG AG AC A AATT AG A ATG 
GCAGTTGCCATGGGCTGGGGGAGAAGGGAATGTGTTTAATAGACACGAATTTGATAAAAAG 
GAGTTCTrGGAGACGATrGACAGTGATGGCTGCACAACACTATCAATC1\ATlTCATATCAATGC 

25 ACTCACTACACGCTTAAAGATAGTGAAGATAAATTTTGTGTACCATTTTACCACAATTAAAAA 
IA-T-TTTTTTA A AA GAACTCAAAGAAGCAGAAAGT^ 

CATCCAGCAAGTCCTTGGCAAAGAACTCTCATCAAGAACCAGCTGCACTGAAGCAGGGAAAA 
CAGAATCCAAACGGCAGATTCCATCAGATTTTGAGACAAGATGACCATAGATACCGACCATG 
TAGGGTCCTCCTTCTTTCGTGCCTGAGTCACCCCAATCCCTCCCACGAATGGTCTGGAAGTGTC 
30 TGTGTTACTTCTAACACGTTCCAGCAATTAAAGCGCCCCAGAAACAAGTAAAAGCCTGTAAGC 

CTAAAAAGCGTTAAACCTGGCTTTCAGTTCTAGCTGGTTGTGATATAACCTCTTGGTACCTCA 
GTGACTTCACCCATTAAAAACAAACAAAAAAAAGTATATCACTATCTCTCATACAGAATTGTT 
GGGAAGCCCCGCAAGAAAATCAAAATATGGCTCTCAAGATGCGGCACCCAAGCTCCCAGAGT 
35 CAGAATCACTGGGTGGGAAGTGTTGGTCTAAAATATAAATACCGAGGCCTCAATCTACTAAT 
TCAGAACATCTTGGCATGAAGCTTGGAAATCTGCACTACTrc 
CACGACAGAAATTTGAAAAACACTGAGTAGAGAACTATATTCTAGAATGGTATAAGCTCTTA
AAGAGCTAATC aTOG TTCCTCAAAGGTAGAGTCCACGGCCAGATTCCATTATAGGAGA GGAA 
GCCCGGACAGCAGACCCCGGGCCCTCCCCACCCCGCCCCGCCTCTGACTCGGACACCAGCCTTC 
TCAGACCCCGGGCACTCGGCCACCCCGCCCTGCCCCTACCCTTGGCCTCCTCCACCCTCCCCTCA 
5 TCCCTCCGCCGACCCCAGGCCCACTCCGACTCGGACCCCCACCCCAGTCCTCTCCGCCCGACCGC 
CACGGCCCACCAGCCTGTGCCGCTCACCTGGATCTCTGGAAAAAGCTGAAGGAAGACACATCG 
TATGCGGCTITGAGCAGCACCACC1TGGCCTCGCC1TTGTAGAGGACGCTGAGGCTGTACAGC 
TTCATGGTCCGGCCCTCAGGCCGCCCGCCTGCCCAGCTGCGGGACCCGTTCTCAGGGAGCAGC 
GC GG C CG C C G C CCCTC G GG. A C CGCCGCCGCCTACCGGCCTCTCAGCAGCCGGCTGCTGACGGG 
10 GCCACCGCCGGCTTCCTCCTCCTGGCTCGCAATCCACTTCCGGATCCGGTCAGCCTGGTTGAGG 
G1TCTCATACTCCGGATGCAGAAATGTGAGCCCGGAAGTACAATGCAGCGAGGGGCGGGATG 
C CACGCCTCGCGTAAGCrrGGCCCCTCCCTGCTCGCCAGGTGGAGTCGGGCGCGCGGCGGGAT 

TTTTTTCTTTTCT1TCTTTTTTTAAAGTAAGCAT1TTTTTTATTATTATACTTTAAGTTTTAGGG 
15 T AC ATGTGC AC AACGTGC AGGTTTGTTAC ATATGTATAC ATGTGCC ATGTTGGTGTGCTGC AC 
CCATTAACTCGTCATTTAGCATTAAGTATATCTCCTAATGCTATCCCTCCCCCCTCCCCCCACCC 
CACAACAGTCCCCGGTGTGTGATGTTCGCCTTCCT 

CACCTATGAGTGAGAACATGCGGTGTTTGGTTTTTTGTCCTTGCAATAGTTTGCTGAGAATGA 
T G GTTT CC AG C TT CATCCATGTCCC T A^ 

20 GTATTCCATGGTGTATATGTGCCACATTTTAGGAGGAGCTTGTACCATTCCTTCTGAAACTAT 
TCCAATCAAAAGAAAAAGAGAGAATCCTCCCTAACTCATTTTATGAGGCCAGCATCATCCTGA 
TACCAAAGGGTGGCAGAGAGAGACACAACAAAAAAAGAATTTTAGACCAATATCCTTGATGA 
ACATTGAAGCAAAAATCCTCAGTAAAATACTGGCAAACCGAATCCAGCAACACATCAAAAAG 
C n\ATCCACCArGATCAAGTGGGClTCATCCCTGGGATGCAAGGCTGGTrCAACATACGAAAA 

25 TCAGTAAACGTAATCCAGCATATAAACAGAACCAAAGACAAAAACCACATGATTATCTCAAT 
AGATGCAGAAAAGG C CT1TGACAAAATTCAACAACCCTCATGCTAAAAACTCTCAATAAATT 
AGGTATTGATGGGACGTATCTCAAAATAATAAGAGCTATCTATGACAAACCCACAGCCAATA 
TCATACTGAATGGACAAAAACTGGAAGCATTCCCTTTGAAAACTGGCACAAGACTGGGATGC 
CCTCTCTCACCACTCClWrCAACATAGTGTrGGAAGTl^CTGGCCAGGGCAATCAGGTAGGAG 

30 AAGGAAATAAAGGGTATTCAATTAAGAAAAGAGGAAGTCAAATTGTCCCTGTTTGCAGATGA 

CTTCAGCAAAGTCTCAGGATACAAAATCAATGTGCAAAAATCACAAGCAGTCTTATACACCA 
ATAACAGACAGAGAGCCAAATCATGAGTGAACTCCCATTCACAATTGCTTCAAAGAGAATAA 
AA TACCTAGGAATCCAACTTACAAGGGATGTGAAGGACCTCTTCAAGGAGAACTACAAACGA 
35 CTGCTCAATGAAATAAAAGAGGATACAAACAAATGGAAGAACATTCCATGCTCATGGGTAGG 
AAgAATCAGXATCGTGAAAATGGCCATACTGCCCAAGGTAATTTATAGATTCAATGCCATCCC 






SUBSTITUTE SPECIFICATION-MARKED UP VERSION 



TATCAAGCTACCAATGACTTTCTTCACAGAATTGGAAAAAACTAAAGTTCATATGGAACCAA 
AAAAGAGCCCGCATTGCCAAGT C AATCC T AAGCC A AA AG AA C A A A GCTGG AG G CATCACACT 
ACCTG ACTTCT A ACTj\T ACT AC A AGG CTAC AGT AACC A A A AC AGC ATGCT ACTGGTACC AAA A 
CAGA GATATAGAGCAATGGAACAGAACAGAGCCCTCAGAAATAATGCCGCATA TCTAC AA GC 
5 ATCTGATCITTGACAAACCTGACAAAAACAAGCAATGGGGAAAGGATTCCCTATTTAATAAA 
TGGTGCTGGG A AA A CT GG C T AGCC ATATGT AGA A AGCTG A A ACTGG ATCCCTTCCTTAC ACCT 
TATACAAAAATTAATTCAAGATGGATTAAACiACrTACATGTTAGACCTAAAACCATAAAAAC 
CCTAGAAGAAAACCtAGGCAATACCATTCAGGACATAGGCATGGGCAAGGACTTCATGTCTA 
AAACACCAAAAGCAATGGCAACAAAAGCCAAAATTGACAAATGGGATCTAATTA.AACTAAA 

10 G AGCTTCT GCACAGCAAAAGAAACTACCATCAGAGTGAACAGGCAACCTACAGAATGGGAGA 
AAATTTTTGCAACCTACTCATCTGACAAAGGGCTAATATCCAGAATCTACAATGAACTCAAAC 
A AA -nT A CAAGAAAA AAACAAA C AACCCCATCAACAAATGGGCGAAGGATATGAACAGACA 
CTTCTCAAAAGAAGACATTTATGTAGCCAAAAAACACATGAAAAAATGCTCATCATCACTGG 
CCATCAGAGAAATGCA/VATCAAAACCACAATGAGATACCATCTCACA CC AGT T AGAATG GTG 

15 ATCATTAAAAAGTCAGGAAACAACAGGTGCTGGAGAGGATGTGGAGAAATAGGAACACTTT 

GGGATCTAGAACTGGAAATACCAnTGACCCAGCCATCCCATTACTAGGTATATACCCAAAGG 
ATTATA A ATC ATGCTGCT AT A A GG A C AC ATGC AC ACGT ATGTTT ATTGTGGCA CTGTTC AC A A 
TAGCAAAGACTTGGAACCAACCCAAA TG AACCCITCTTTT^ 

20 AA G TC TATG GA TAGG AATGAGTGAGGCACAGCTCCCTGAGGATGCCATATCTTGCCCGTTTCT 
KT TGTATr AAGTGACATCACGTGlTACCA/^CTAAACCGGCl'GCA T TTGCCTGC GC ACAACAT 
A AAACCAAACACCCAAGCATTGGATTTTTGTAGCAAGAAAGATGTATTGCCAAGCA GCCTIG 
CAAGGGGACAGAAGACGGGCTCAAATCTGTCTCCCAATACTTGCTTCGCAGCAGTAGATTTAA 
GGGAG A GA T rJTG GA AG TGGA GlTTCGGGCTGGACGGTGATTGGCTGAAACGAAGAAGTGTT 

25 TAGAAAATCTCTTGGTCATGAGCTGTTGCTTCTTCATGCTGCTTCAAGGGTCACATGCAGATT 
CAGGAGGTGGTATAAAACAAGCTGTGGGAATTTGGGCTGTGACATCAAAGGGCCGCTCCTCG 
GGCTAGTAAGTCTATTTTGCACAGGCTCCAGTCAGCCATATTGGTTCCAACCTGTTCCAGCAA 
GTT GT ATAAGCA GAGGGGATTATAGCAAACTG T TTC C TTATCGGCTGCCCTGCAAGACAAGCT 
CAAGATTrCTGTTAGTrx\CCAGlTTCTTTAACCCTGTCGGGCACAGTTTCACATGTAATCAGA 

30 AAGGAACTTGCAAGACACATACAACTGAAAGAAACTTGG T CTTTGGAAGTTGTCAGTAAGGT 
CACAAAGTTGTGATG C TAGAAGC A GCCGTATC TG A G ATTATG G G A AAGAGATG A TATATTGG 
AAAAACAACAGCATCACTTTAAACA1TACTCTAAATCAAGGTTTCTCAACCTTGGCACTATTG 




CICAGGCCPTCTAGAAATGTCA^ 
35 AAA ATGTCCTTTGGG AGC A GTAGTTTTG AG A A AC ATTGCTTTGC AG AT AT AT ATGTTTGTTTG 
TrTGTTTTGCTTTGTGAC^3GGTO^CTCTGTTGCCCACK^ 
CACTCACTGCAACCTCTGCCTCCCAGGTTCAAGCGATTCTCATGCCTCAGCCTCCCGAGTAGCT

CACCATGTTGGCCAGGCTGGTCTGAAACTCCTGGCCTCATGTGA TCCACCC ACCTCGACCTCCC 
A AATTGCTGGGATTACAAGCTTAAGCCAC T GCGC CCAgCTC AGA A A CATT GCTT T AAATAA T C 

5 TGTGGTGAAAGGAAGTTCCCACCACCTGCC^ACTCACTCAGTACCTCTGTCACCAACCCTCTTC 
CCTGGGTGTTTCCA AG T ACAGAGGGTGGAAAGGGCTTTTCCACATTTCCCCTGTTTTGGTAGT 
AAAC Ar r AG G AACAGCCA'ITGGCCG TG GCTAGGCTCAGCCACCCACAGATATGGACACAGTA 
GTCTGACAAGCTGGGTTGCTGGGTGCTATCAGTCCAGGCTCAACTGCTTGCACTGACACCATT 
TCCCTATAGGAGGCAGGTGy\GAGCCATTTCTGAGGAAAGTCTCTGG AG€CCCTCTTCCTTCCA 

10 CTGAAAGTTGTGCAAAAAGATCAGGAAGA CA GC GCTTGGAT GGAATAAATTTCAGTGTATCC 
ACTTGACACATTATAGTGGCTGTCCCAAAGTTTACCTTATGCCAAGTACTTTCCATGTGCCACA 
I CAriTAATCCTCACAAAAACAGGGGAAAATATrArrGCCACCCTACAGACATAGAGACTGA 
GATTCAATTTAAGGAGATGGTTGGTAAGGGACAGAGTTGGGGTTCAGATGTCAACAGTGAAA 
T GCTTAACAAACTGTCATG<^AGCCCACTCCTGGCAACTCTTCCTGCTCCTCTCTGGCCTCACTC 

15 AGCCTCTACTGTTCCAGGAAGCCTCATTCATAGTCATGTGGTTGCAGACTTCCCAAGCTCACTG 
TOXTACCAAAAACCAAGACCTGCCTTCT^ 

G- TCC C A GCACTG ACACATCACAAAATCACAAAAGTGAGCAAACCATTACCTCCCTGAGTCTCC 
TTTTGTTTTTATCTATAAAACTAGAAAAATATTCTTTCCATAGGAATGTTGTTGGAAATAATA 
AAACAHA3^AHA£AAQC3^^ 

20 TCTTCTCATTAATGAAGAAAAGGATTATTAATCATAGAGGGTGGAAGGCATCTATGGGAAGT 
AGAGATTTGAACtATAGGCTAAAACCCAAGTA4GGCCTCTAGATTAGATAATAGTATTGTATC 
• TA TTTTAATTTCCTGCTTTCCATCACTGTGCCATGGTTATATAAGAGAAGTCTTTGTTTATAGG 
AAATATACACAAGAATTTAGAAGTAAAGGGACATTGTGTCTGCAACTTACTCTTACAGGGTG 
TCTGTGTGTGTGTGTGTGTGTGTGTGTGAGAGAGAGAGAGAGACAGAGAGAGAGAGAGACA 

25 GAGAGAAAGAGAATGATAAAGCAAATACAGGAATCAGGATGAAGCGTATCTGTTTGTTTGTT 
TTGCTTTGTGATAGGGTCTTGCTCTGTTGCCCAGGCAGGAGTGCAATGGTGTGATCCCGCTCA 
CTGCAACCTCTGCCTCCCAGGTTCAAGCGATTCTCATGCrrGTATTGTTCTTGCACCTGrrCTG 
CAACT ACAACATTGTGGGAA.TGGAAAA.TGCAGGAAATGGGCAGTAAGGCTATGAACGAAGC 
CCGCACAGGAGTGTGGGTAGCAGAGTrCTCTAGTCCAGGCTCCCACCTGAGGTGCTGGGACCT 

30 AGAAGAAAAGCCTCTCTGCAGACAG AACTGGA GTTAACGCTGTCCACGATAAATGGCCCAGG 

CACATAAGGTGCCAGCCCTTGCTGTACAGCAGAAC CT TTGGGGAGCTGGACAAAAGCCTATCA 
A GGAGCATACCCCCAGGAAGCCCAGTCCAG GTGGGGAGCCCAGCC ACACAATGGCCCTTGCCC 
CCACACCTCCTCATTCAG TC AGC T AAGGCCATGGCAGCTGAGCTGCCTCCACAGCTCATATAG 
35 GAAAAGGGTGTGGAAAGGGGCCACCAATGTGGTCAGGCCTCCATGGCCTGAGTAGGTCACCA 
AGC CT CAGGTGCA C A G A CTTG A T GT C AT C AATCAGGGT CTGTCAG C ACACCTAG CCCTCAGGA 
ACACTGCTCCCCACTGCAACCCCACACCAAGGCATCCTGGGCTCCCTCTGGGTTCTCCAGGCCC
CAGGGAAGACAGACAGAGTCTGCCACCAAAGGTTTGAGCTCTGCCACTGGCTACGAAGCAAT 
AGGGGATGTCAGAGCAAGGGAGGAACAGGACAGGAGTATACGTGGGCAGGAAGGGATTACA 
GCCA AGGAAGACAGGAGGCAGGTGCCCTGATTTTGAGGCTGTGCCCCAGCAGGGGCTTCCCA 
5 GAAGCTGTATTTGTCCTAAGACACCCCTCTGCAGCTGAGGGGCTAGAGATGGATATGTAGCTG 
TGTTAGGCCATTCTTGCATTGCTATAAAGAAATACCTGAGACCAGGTAATTTATAAAGAAAA 
GAGGTTTCATI'GGITCACAG'ITCTGCTGGCTITGCAAGAGGCATGGTGCTGGCATCTGCTCAG 
CCTTTGAGGAGGCCTCAGGAAACTTACAGTCATGGCGGAAGGCAAAGGGGx^AGCAGGCACAT 
CACACAGTGGAAGCAGGAGTGAGAGAGAGAGAGGCACTGGGAGGTGCCACACTTTTAAAG A 

10 ACCAGATCTCGTGTGAACTCAGAGCAAGAGCTGACTCATCACCAAGGGGATGGCCCAAGCCA 
TTCATGAGGGATCCACCCCCATGACTCAGACACCTCCCACCAGGCCCCACCTCCAATATTGGGG 
ATTACAATTCAGATGAGATrTGGTOGGGACACATATCCAAACCATATCAgTTATCAGTAGCGA 
TACTGGATGAATGCCAGGAACTTAGAATTAGGACACATGGTCATTTAGGCAAGTGGCTTGTC 
CTGTCAATGGTACCCTGATAGTCGTGGGGTTGCCCCGTACAAAAAGCGAGAGGAAGTCTACA 

15 GAGCTGTCAAAGAGGGGCAGGTGGAAAGGCCTGCAGAGGAGTCCCCTGCTCCACAACCAGGC 
GTGCACCTCCCACATCCTCGGGGCTGTAGGCCCCACATGAGAGCAGAAAGAAGGATGCAGAG 
GAAGGCC 

AAGAACACAAGGTGTGCCCTTGGAAAGGCTGGGCACACCAAACACAACCTAATAAACAACAG 
CAATGAGGACA^GGGAAAGT^CT^ 

20 CCCTGCCACATGGGGCCCCAGGCCCCAGCCTATCAACCAGTGGTCCTTATTGCCACAGCGATTG 

XA GGCC ATCC A ATC AG AGC A A AGCCC AGGACTTTCTTCG ACTCTTA AGAA A AGAG A AGC A A A 
GTAACTGGCACAGATTGGAGAGGATCAAGGAACCCCGAGCTGGATACATACAAACTTTGGGT 
TAACATGGATGATTAAATACATATGTTTATGTGAACCACCTCCCAAATATGCTCCACTATAAT 

25 GACACAAGACAAAGGGCAGGGGGAGACCAATTGCAAGGTGGCGCAAATGAGAGATGCTACC 
AAG GGTGGCGGGGGAGAGAGGGGAGCAGTTGTCAAGTTAGGAGGCAACAGGCTGAGGGACA 
GGGACCAGCAGACGGGGAGGGAGGGGCTGAAGCAGAAGTGTCCAGTGTCTGGAGGGATGGG 
GCCAGAAAGGCAAGGGGCATCCTGAAGAAGCTATACCTGGGGAGGGCAGCTCTCTCCCCACC 
TGCTCCCCAAITCATCAGCCAGGAATGCCCCATCCACCCCACCCCAGGGAGGAGGACAGAGGA 

30 CTTTCGTTTGGGAGCATTGAATGGTTCAGAGATTCTGCAACTCTGCGGTCCCCAACTAAACTG 

CGCTTGAAAGCCAAATACAAGAGGAGAGGTTTGGTGGGAGGAAAAGTGGTTTTAACTAGAG 
CCAGCAAACCAAGAAGATGGTGAATTGTTGTTTTAAAGCATTCAATTATCTCAAATTTTAAAA 
T-C TATCATAGGATTCTGAAAGGAAAACTTGGTATGGGACATACGTGGGAGCAGTGCAGGGTA 
35 CAGGGTCTATGTGTCTTGATCCAATGGCTGTCTTGAGTATCACCTATCCTGAGGTCTGGTTGG 
TOTO\3£33TCC33£GGCCAGA3^ 
ATTCCGCAGGGGTTCCGTGTCTGGTTTTTGTTTCAAGATTAGCCCCTGGAATTCCCAAATAAG
CATAGA GTT AGATAAGCGGG C A TGGT GCAAAGGAGTGTCTAGTGGGAAAGGGAGAGAAGCA 
GAGTTTCAAAGTACATTTCAAGGTTACATTTTAAGACTAAAGAAAAAGCCTTAAAATGCATT 
T TTAAAGCTGATTTAATGCTTGGCTACACTAGGCTGTGGCCAGTGTGCAGTGTGGCTGCTCTT 

ACCTCTGTGTGAGGGACTACCTTGGCCCCTGTCCTTAGCAGGAAGCTATGGTAAGGAACCCTT 
AGGGAGACA'rrAAAll'GGGGAGACCGTCCCTGCCAATCCllTAACCTCCCCAGCCTCAGCGAC 
CTCAGTTGGAAAGTGGTGGTAATAATACTACCACTGACCAGGTGTGGTGGCCAGACATTCCAC 
ACTTTGGCTTCAGCCGGTCCCTCCCCACTCTACTGTAATCCCAGCACTTTGGGAGGAAGAGGTA 

10 GGCGGAACCTGAGGTCTGGAGTTGAGACCAGCCTGGTCAACATGGTGAAACCCCATATCTACT 
AAAAAGAAAGTACAAAAAATTAGCCAGGTGCAGTGGCAC A CGTGTGTGGTCCCAGCTACTCG 
T G GGTCTGAGGC ATGAAAArrGTlTGAGCCTGGGAGGCAGAGGTrCCATI^GAGTGGAGATCG 
AG CCACTGCA CTCCAGCCTGGGTGATAGAACGAGATTCTGTCTCAAAAAAATAAAAATAAAA 
TAATAATA A T A A TACCACTGCCTGCCACACTAAGATTGTCTGATTAGA TG A C AGAATGAATGC 

15 AAA AGT ACTTTGTG A ATC AT A A ATGTTTTC ATC A AT ATT AGTT AT A ATG AC AATTGCTCCTTC 
TCCTAATAAATGTATTGCCTTTCTTTAGGAATAAATATAACAAGAAATGTGTAAGATATATAT 
GAGAAAAATAATAAAATTCACCTGAAGGACATAAAAGAAGACCAAAATAAATGAAACAACA 
CATACTTCTAGATGAGAAAACTCAATATTATAAAGAGGTTAGTTCTCTAAAATGAATCCCTAA 
ACCCAC A AAGTC A ATGT AT TT C C A ATGAAATT GT CAACAGC ATTA TTT T CC GAA GTG GGATGA 

20 GTAGTGCTAAGATTrATAAGAAAGCCAACATTCCAGAGCAGTGGGGAAGGGATTGCTTCACC 
ACCAAATAGCCATATTAGAGATTCCCTTGCACCATACCCAAACCACCATCTCCCAGGACCCGG 
G AG AGCAG A AA AGAGG A ATGAG AAG AA AGGCGAGGATGTGAGGTGTGCCCTC ATA ATGGCG 
GTGCACGCAGCACAAGCAATTGCAGAAAGACTAAAGTACTGAACAAATAGAAAACTTGGAA 
AAATATTAGAAGGAAA TG TGGGAGAACATTTTTGCAATTTGGGGATTGGAAACGGTTTTCTT 

25 A AC A AG AT AT A AAA ACCCC AAA AC A AG A A A AC A A AGGTTG A A ATTC ATA A AA ACTAG ATAC 
TTCTG T ATGATGAAAGACACGATTAATCAAGTTGTTAAGTTTAGCAATAGACTAGGGGAGAT 
ATCATAGTATATTTAACAGACAAAGGATTAATAGATACTACAGATGAAATATAAAATAGTTT 
CTCCAAGTCCATAGGCAGAAGATAATCCAATAGCAACATAGTTAAGTAATGTAAACAAATCA 
TCCTrAGAAGAAGAAATGCAATCACCAAGAAACACATGAAAAGG T GTCCAGCA'nTTGCAAT 

30 TCAAGCAACAATGAGGTGACAGATCGGCAAAAAACTCATAAAGATTTATCATCTGAAGGATT 
GG CCA A G AT AA AGCCAAACTTCTCGTGTTGGCAGAA GA AAC T GGTGA A G C C ATG TGAA G AG G 
CCACGTGGTCCTGCCTACCAAGATGTAAAATGTGTACAGCATTTGAACTAGCAATTCAGCCTC 
CA^AGCCATCCAGAAGAAACACTGACACACACTTAGACTCCGGTGAAATTCAAGGACTTCT 
GC CACAGCCTGCTTCGTAATAGTGAAAATCTGAAACTGCCTCAATGACCGTCAATAGGAAGTr 

35 GATTTTAAAGTGTTACAGCACATCTGTCTGGAGAGATCGCACTGGCCACTCCTCCTCACCCCCT 
CTGCTGGACCTCTGAGCGTAGGTGGCCTGGAGCTGGGTCCTGAGCCCTCTTTGGTCTATACCG 
ACACTACCCAATATGGTAGCCACCAGTCACGCTGGACACTTGAAAAG TGGCCG ATCCTGACTG
AGAAGGGCCACGAGTGGGAAAAACACACCAGACCTCAGTGACTT A GG C A G AA GT ATGTT T T -G 
TrCCAGACTATTGACTGAGCCCGCAGCTGAGTTGGCTCCAGCACCCTGGCCCCCTGCTCCATCC 
ACTCACTGGGACTCCCCACTGCACAGGGCAACCTCTCCAGGGGCACTTGGGCTGCGAAGGGGA 
5 GA GTGGGTG GC ATC C CAGGCTGAAGCITCCTGAGCAGGGCCAGAGGAGGAGC C AG'rCCCTGT 
GGGCCTCTGTTCTGACAGTGTCAACCTCAGCCAGGCTTGTGTGGGCCAGGTGTACTGTTCTGG 
nCAGATO^V A G G A GATA GTC ^ G G GCAGGCXGCGC CAAAGCCCTCCGATGGGCTCCCCT 
GCCTGGCAGACCTGTCCAGCTTTGGACTCTGGCCCTGCGACCTGGAAGTCAGGCTGCCAAGAG 
GTCCAGGC AGTGG CC TCCAC T G TG GAGGGTCTCTC 

10 TrAGGGATGTGAGATGGTCCCAGGGGCCTGCTCCTGAGCCACGCCAAGCTGCCTGCTCCCTTT 
CCTCTGCTTCCAGACTCACGGGATCCTCTGCTCATCAGAACAGGAGTGTGGGAGACCCTGAGA 
CACTGCCCCAGGATCTGAACAGGTGGCAAAGGCTrAACAGGCTAGCGGTCACTGTAGTGACA 
AGGCGATTGAGTGGTCACCATGGTGATGGGGATGGAGGCTCTTTGCCACCAGTCCCAGTTTTA 
-F GCATGGCAGCTCTAATGACAGGATGGTCAGCCCTGCTGAGGCCACTCCTGGTCACCATGACA 

15 ACCACAGGCCCTCTCAGGAGCACAGTAAGCCCTGGCAGGAGAATCCCCCACTCCACACCTGGC 
TGGAGCAGGAAATG CC GAGC GGCG C CT GAGCCCCAGGGAAGCAGGC T AGGATG TG AGA G A C 
ACAGTCACCTGCAGCCTAATrACTCAAAAGCTGTCCCCAGGTCACAGAAGGGAGAGGACA'rrr 
CCCACTGAATCTGTCTGAAGGACACTAAGCCCCACAGCTCAACACAACCAGGAGAGAAAGCG 
CTOAGGAeCCCAGC^ 

20 AAGGGTCCAGAAGGGAATGCTTGCC G AC T CGTTGGAGAACAATGAAAAGGAGGAAACTGTG 
AC'tGAACCTGA-AACCCCAAACCAGCCC G AGGAGAA CCACAT T CTC 

GAGGGGTCCCAG CTCCCCACGCTGGCTGCTGTGCAGAT GCTGGAC GACAGAGCCAGGATGGA 
GG CCGCC A A G A AGGAGAAGGTATCTCGCCCTCCATTGGGCATrCTGGGAGTGllTGClTGCCT 

25 GTCCCCAACATTCCATGGTTTGTTrGAGCCTCAGAATCTGATTTTATGCACAGGCTCTTTGAGA 
AGGGTCTTGCCAGGGGTGCCTTCTGGGGCAGGAAGGCCCCTACTGCCTGGCAGACCCATCCAG 
CTTTGGACTCTGGTCCTGCGACCCGGAAGTCAGGCTGCCAAGAGGTCCAGGCAGTGGCCTCCA 
G TGGGGAG G ^tGCTCTGGAGAGTTTAGAGCCCTAGATGTGGGGGTTAGGGACATGAGGTCTTG 
1 'G G AGAAAGC CCACTACCTGATrrrGAGACAACACTCACTAGACATGGTGACAAGTCAAAGA 

30 TGCCTTGCCTCCTACCAGGAATCACTTCGCAGGGAGCCCGAGGGCTGCTGTGGCCTGCTGAGG 
AG3^:AGGGCAGWAGl^rari^GAAAA-ACAAAGAGAA-A3« 

GAGCCCAGCAGTGAGCAAGGAGAGAGCTGGAGACAGGGGACTTTGCTGTGAAACACTGGGG 
GGAATGTGCCTGCATCACCCCAGCTGGGGGCCCAGGCAGAGTGGGGGAGAAGGGGTAAGTGG 
G CAGAGCCAGTCAC^rrTGG G CATGCTTCCCTCTCGCCTCTGTGTGAAATGACCAGGTCAGCAT 
35 AAACCCCGGG C TGGCTGTGCTTCTGGCAGAGCTAATGATGTTAGGAGGAAAACAACCAACCC 
AAGTCAGAG G GTGCGCAfiCC- AGACAGC TGGACCGGCCG AG GC C C CAACCAAGTCCCAGATCT 
GCCTGTCACTGGTGCTATGGCAGCAATTTGGATGAGAAATCCTGCCCAAAGGGCCCCTTCAGG
C^ AC C CGGG G AGAAGGAAG CGGCTGTCTTTGGCA TGACCAGAA AGATGGC T C G G A G CTA GGG 
AGAGGTGGACATGTGGGCTGTGGAGATCTGGCACTTTCCCCAAACAAGGAGAGAAAGCATAG 

5 TCCATGTGCACATATGTGTGCACAGGGAATCACTTTAATAAAGGCCACAGCAGAGCTGTCCCT 
G AGCCCCTTGC ATTCACAGTGGCATGTGAG TGA AC C ACCTTCTTAGGCTGGGCATCCAGTCTC 
A GA CTGT GGGGCTGCCCATG CCCCATCC TTTA T CTGC TCC A CGTGTGAGGGGTTGCTGGTCCTG 
ACCAGGGCCAGCTGTGAACCCCAGAATCCTGGGAAGTCACTGACATTCTTGTCAGGGCCAAGA 
G TGGAGC AAGG C AATGCCTCGGGCACAAAC TTT AA GG GGTCACCAGAAACATCAATCATCAA 

10 GATATATGCTATTTTAAATAATCAAA ATGAATG CAAAAAAAATTTATGATGGACAACATACC 
A A ATTCT A A AC A AAGGC AGG ATG AGT ATC ACTGGCTTCTGC ACTTTTCTCC ACCC AGTCT ACC 
CCTCrrCTAG r rGCCTGGATCGCAGGGTGCCAAGGCCTGGATGAGGGAAGCGTGGAGCTGCAA 
TGGCCACTCCTGTCTGCCTGTTCTGGCTGCACAGAGGACTCAGTCCTTGTCTTGGGGGAACCTA 
TCTTGGTTrrAGGGTCATCCTAAGGATCTGATGTTTTCCAAGTGAGCTGGCTGTCCAGGCCCA 

15 CCCAGGTTCAGTCCAGTCCTGTGTCTCTGGGAAGTGCTGCCCCTACCCCAAGCCAGTGTTTGAC 

CT-FGGAGCAATOAGG AATGCCCTCCTTC^ 
TTGTCTCTGTCCAGACACCGAGGAGGAA 

AGTAAAGTGTTGCTGGTTTCCCTCTTTCTACTTCTTTGATTGAGAGCAGCCGTCTTGCCGGTAC 
CAACC TI XXAGAT CTTA C CTG TGGTTGCAGGAGCCT 

20 CTCCATGTGGATGTCAGCTCCTTAGGGGCAAGCCTGATTCCACTGACACTACTCCCACCCCTCA 
TAAGCCCCTO:TTACCAGCTGCAG1TGCCTGGTACCCCACCATCGCTGACTCATTCCTTTGGCA 
TCAAGGTTCATCCCTTACTGGGCCACCACTTCTGGGTGGCCTGAAATAGGGCCCTGGGCATCC 
CTCTTGGGGACCTTTTGGTCTATATTTTCACTCTCACCTCACTAAGGACAGATGAGTAAATCTG 
GTTAACTTTGCCTGATAGATTTGGTGACCTTTTTTCAGGAAGGAGCCTCK3AAAGA 

25 AGGTGT ATTGGTC AGCTTAG ACTGCC AT A AG AG A ATACC ATCC ACTG ATGGCTTAG A A AC A A 
C^ rAAATCTA 3 ^CTCACTATTCTAGAGGCTGGACGTCCAAGATCAGATGCCAGCATGGTCAG 
GTTGCAGGGAGGGCTCTCTTCCTGACTTGCAGACCGCCACCTTCTTGCTGTGTCCTCACATCGT 
GGAGACtAGAGTGAAAACAAGCTCTCTGGTGTCTCTTCTTATAAGAATGCTAATCCTATGATG 
GGGGCTOCCCTCCTTACCTCATCm^ 

30 ATCACACTGGGGTTAGGGCTTTGAC AT A TGA A T CTGGGGGGACACAATTCAATCTGTAACACC 
A GGAGGG C ATGCCG G GAGGAACTGACCTrCCTCC CTC C AGCTGCC CT G GACACCTT TO 
TTGAAGGAGCAGGCTCAGAAGTGGAATGAGGATGGAATAAGGTGCACTCCATCATGCTTACC 
CACATGCCTGGCAGG A ATTGTCCTGGGCC CCAGCA G GAG A 

CACCTGCTCTGAGACAG GTGT GCAGAGTGC AA AG CTC CAGGTGGCCCCCAAGCAGGTGT G CT G 
35 GGAGGAGGGG C CCG T GTGGGAGGAGCAGGCAGCGCCAAGGCCTAGCGGAGCAGTGACAGGT 
CC C TGACITC A GGGAATGGG CA CG C TGTGG G C AGGCAGCTGGTGTGGGGGTGAG GGCTGGGG 
CTGCATCTGTGGGACCAGGGCTGGGCCATCCTCATATGCCGTGTCACAACCCCAGTGCCCCTGC
TGTA GC CAG Q ACAGGA G G CT GG GCCAGGCTC 

AGAAGCACTCTGC TGCCTGTGCCCAGG C C TCTGGGGATGAGGACCCCTCAGAAGGAGTAGC T A 
TGT CT AGGAAGC C CCAG GG CAGGAGCAAGCCAAAGGGGACATCATTAGTGAGATCCAGGGGA 
5 TCA GTGGGCC ACAGAA GCCCCAGCGTGAGCCCCTCTGACTGATGCAGCT AG GC CC ACACCTGC 
ACCTGCCCACAGCAAGACCCCCAGGAGGAGAGGGGACAGATGGAGAGAGGCACAAAGTGCCC 
CTGGCCTCTGCC TrGAAGCCACCCCAAGGCAAGAGAGATtTGAGCCCCTGlTrAGTGACCTCC 
AGGGGAACATTCTGGCCCATCTGATGTGGGAAGCCCCTTGTGGAGTCTGTCATTCCTCAGCTG 
AGCCAGGCCTTTG GAGGCAGCCCAGGCATGTCCCCTGTGTGCTCCTATCCCTGTGTTGGGACA 
10 CCTGGCCCAGCCCCTCCTTCTGCCTTTCTCTTCCCTTCCCTTCTCAGGAGTGGACACTTCCTCCT 
TTAGCCCCCTCACAGCTGTGTGAACTTCTCTGTATCTCTCTCTTTCTGTCTCTTTCTCCCCCTCTC 

CTCTCTC TCT C T CTGTG T C T ACCTTTCTGTATTTCGCTTTGTTTCTTTTTC T CT GT G T G T GTGTG T 

15 GTGTGTGTGTGTGTGTATATCTGnTrTCTCACTCTCTCAATCTCTCTCTCTCTTTCTGTCTCTC 
T T TTGCTGG C C TG AGCAAAGA G GGAGCCCCATCCTGAT GCTACATAACCG 
TGAACCAGC A CAGACAGAATTGTAGGAAAGTCCTGCAAGTAGAAGGATAGAAGGATGAGGG 
AAGAAACGCCATGTGAGTCATGACAGATCCCTTTCCAGGAGCCACTGACTCACCCTGCCTCCT 

GCGC-KXC^G'I&l-'G^^ 

20 TGTGGTTCCCACAGTGTGGCTCCCTGGGT CTG GCCTCAGGCTCCACAGGTGCCCAGCCCTGCCA 
AAG T CTCCAGAGCAGCTGTCCAGCTGGGGAGCTGCGGGGCCCCTTCACAGAGC GCA TGGGAA 
GAAGTTC C AT CC TACACATTACATCGAGAG GG AC GT GCCTGAGAAGGGGAGCTGGAGCCCGT 
GCAGCCCCCTGCTT GC GTG C AGAACATAGTGTACCCTGAGCATGCCATGAAAAACACAAACGC 
ACAAAGTTGT A A A GAAAAA AGAAA T GACAGGTGGCTGTAAAATCAGTTATAGCCCACGAGA 

25 GGCCCACTAATGAGTGGTGATTTCAGCTGATTACAAAG A AA T GATGGTGTTTCTGTAATGAA 
CTAAACATGCACTC G TGC GTGCA CACAC G CG CACGTATA GTCACATAACTGACCAGCCCTATG 
CATCACTTGTTAATTACTTAGTAACTGTAACAATAATAGTTTCCAATAAGTGAGCCTTAGTCT 
CTGCGCAAGGGTCAGTTTATTGAGCACACGGGGGC C TTGCAGTGGGGGCAGGTGATCTGCTCC 
TGGGAGCCGCCAGCCTCTCCTCTCGTGCTCTrCATCTrCCTCCGTGGTGGGAAATrGTCTCACT 

30 GCTTCTACACCTGAGGCTGAACATCTCCCTTTATTTCAGTCTGAAACACATGTAAAAATATAC 
TGGAATG A ATTAAGGTTGCAATTATrGATATCAGGCAGTGAGTACATCA G GG T ITATT AT^ 
T ATCTCCTTT ACTT AC1TCG A AGTTCTC T ATT ACC AAA AA ATT AA AA ACT AT AAA AG A A AG A A 

C CCCTA ACAC AA GCACTTGTGAC CT GGAGTGATAT T CACAGCATTCCTTACCTGGCAATACCT 
35 GAGTATTAGCCCCCCCAGTGGGATCTTTGTTGTAGACAACCAGCAACTATCAGCCCAGCCAAT 
A AACAAGTAGGAAAGGGGAGTGC r GGAGAGGCCAAGAAGTGGGATrrrCCATGCTCCTGGGC 
TGTGA T CCAGAGGGCACGGCTGTGAGGCTGATCTCAATGAACACTCTGTCTTGGAAGTACAG
GGATCCTCTGCTACCTGAAAACGrrCTGAGTATTCACTTTCATGGATTGCAAA GTC A TTT A CC^ 
AAAATTCACTCTCCAAATGAAAAGTGAGTATGATGAATCAGTATTCAAGTTCCACCTGGGTCC 
TGGGAGAGGGCATGGACATCATATCCCA GCT G TTC CGACAGGAGGACCCAATCTGAGTCTCAC 
5 TGCCTGCCTGCATCGTTTGTCTGCTGCCAGCCTGCACAGTAGGAAGGGAAAACATGATTTGTA 
TCTGTTTTAGGTCAGGTTCCC AAGAA GTAGAGCCTGAGATTGGAATTCTTGGAAAATGGTGTT 
I GCGGGAGCGCTGTCAGCAGAAGCTATAAGGAAGTTGGGGGGACAGAAAACGAGAGGTAAG 
A AGCC AGTC AAA A AGGC AGGTCC AGCTTA AGTCCGCCTC AGTCTGGTTCC AC A AGGGCTCTG A 
TG CATGAAGAATA TC A C AG G G T TG T CCC ;T CCTGGGAGAGGGGC CAGCC T AT TG T A CCTGTAIC 

10 AAAGCCACCAGCTGAGGGC C AGTGGGGAGGGAAGATCTTCCAGGCATTTCCAGGAAACTCTC 
AGGAGAAGGGTGTAGCTGTGAGCAGTCTGCAGCTGCTGCTCACTGCGGC TA AAGGCTGGGTG 
TGCAGGCCAGTCAGCCAGTGAGGTGCCAACAGCAGGCACTACAGTCCAC CC C TT GACTGCTCA 
GACCTACTGCTTTCCACTTTAAGCTCTCTCCATCCAGGCACAGCTTCAGGGAAAACTTACAATT 
GGAGAAACAGAGGGATGA4CTACAATGCCCACTTCTGCATGTGATTGTAAGACTGTCACTGA 

15 TACTCACCATCATGCCCCATCCCCACCATCCATTCTAGTGTCCCCTTCCCCTTGGCTAACACTGC 
TGGTCTAGGTGACTTC CCT AGAGC/\ G GAGCCAAACCCTTA TCCCTG A G GC A TC TGA A T CCTGG 
ATTCCTTTATCAGGCTATTGTTGTTGTAAGTTGTCCATTCCCAATTACAACTGGACATGAGACT 
ACCAAGAAACACCCTGGCAAATCATCTGAGTGCAAGCCATATTCTTCCTGCTCCATTATGTAG 
CGGTAGTCCT ACCrCCrAATGACAA G GG T AAATTGCCACATlTrGCTCCTTGTGCCAGGATGG 

20 TAATACCTT T CTC T A CC TGCTTGGCTACTGGCACAAGGAAGCACAGCATGACCAGGAGGCAAT 
TGTAGCTGTACAT TTAGT GAA .TGT G TTi \A TGT AT CA CCTGGTGGAAGGACCCCC T CTGAGAAC 

GATCATGGAGAGTGATGAAGAGTTCTTGGTTCCCAAACCCACATATTTTACCTTTCAGGAACA 
TGG C CTCATCCCATAGCCATTAGAG T GCATATTGCATTCTGGAGGAGACTGGGCCCTCCTCAT 

25 GGGTGTCATCTTCAAGATGACAGCTCCACTGTGCC TCCA AG AG GATGCTCCACCACCCTATCT 
GTG AT TC C TTGG TTAGCAGGACAGGCTGCTGCACTGAGGGTAGGAAAGGCAAGTCCATTGAT 
GGCTGGAATACAT GTC AA TC CAAGTCAAGAGAAAATGCCGCCCTTTCCAGGTTGGAAGGGGC 
CCGATTTAGCCAACTTGTCACCCAGTAGTGGCTGGTTGGTCTCCTCCAGGAGCAGTGTTATAC 
CAG G AA TT CA GC A CCAGTC GCTATTGCTG G CAGTTCTTACATTCA ACA GCAGCAAAACTAGGT 

30 CAGCCTTGATGAGAGGGAATGTATGCT T CTGGGCACAGGCATGGCTTCCTTCTCTGACTCCAT 

T ATTATTGCCTCTTCC AC A AG AAGTGGTTTCTGGC T GTC ATTA ATGTCTC ATACTTTGTGCCC A 




35 CCTTAGGCC AC ATCTCTCTCC AC AC A AAGTGA AT AAGC AGGTGC ACCCTCC AAA ACTCTACTA 
AGAGGA-n^g^rCTC C CC ^GTGTCTrTCAGG G C CACC'r TGAGTGGGGCTGAAGTACAGCA G AA 
GTCCATTTCCAGCTTGCATCAACATTCCAAACTAACCTATCCATGATCAATGCATAGATGGGT
TTTTCCCTCCTCCAGCAGCTAGACAAAAGACACCCCCCACCAGGAGGCCATATTTGCATGTGG 
GTGAAAGAGAGGCACAGGGGCCAATATTCGTGCAACAGTGGTAGATGGCAGGTGGGTCTGG 
GCCAeCTOTOXTGC AGC naKTg^^ 
5 AmCCATCTTATGATGGATCKX^TTATGACCTAGTGGGTCTGACAGCACCAAACTCATAATGG 
GCAGTTATGGCCACATGGTCACTTAATGTCCTATGGTCAGACACTCTGCTGAGTGGCATGCCA 
GGAAATGCTTTACAAGTGGTG-TTrGGTTCTCTGCTGCAGATGGCATGACCTTGGTCCGGAGCC 
CT AGGGGTTTGG AC A GTG ACTCCTGTTGG GGCCT A ATCTC AC ATTCC ATGC AG AGT ATC ATC A 
GATTTGCCAATCACATAGCCTAAGCtGTCAGGACTGATCCAACCAGTTTTTGCAGAGATCAAA C 

10 TGGAGAATGAAAGGTTGATATGATGTGACCATCATATCACGTTTTTCTCTCTTGAAAAGTATG 
CAGATGTCTGAAAGAGACAAGTGCCCCAGGAGAAAATGCATGCCTTCCTCAGGATCGGCCCCC 
ACCTCCCCTCCTGGCCACAAGGAGGGTCAAATCTCAGCATGGCCCAAC r rTGGACCTGTCAAGG 
AAGAAGAAAAAAATTGTATGCCAAAGGAACTCAGTCTTTGGCTAACAAGTACTAGACATCCT 
CT AAGTCTTTGAGAATGGTAATAATTTCTGCCATCCCTCCAGATTTGTGTTTTTCTGTTTTGGC 

15 TGGGTGGGAATGCAGCATTTTCACTTTGCCTTTGTTATTACAAATGTTGCTTATTCTATAAATC 

TCCAAGGAAATGATTTCTGGGTGAATACACAGAACTAGTGGATCCAATTTGAGACATACCTG 
GGCCAGAACTATATTTGTCGTCTTACCCCAATAAGCCTGCACTCTACTAGGACAGCCATGACA 
GCACTTTGGGACCCTAGATATAAGTGTG AATTGCTGGCT GGGC AT GG T GGC TC ACGCCTGTAA 

20 TCCCAGCATrrTGGGAGGCTGAGGCAGGTAGATCACCTGAGGTCAGGAGTTGAAGACCAGCC 
TOGCCAACACGGTGAAACCCCATCTCTACTAAAAAATACAAAAATTAGCTGGGCGTGGTGGT 
GGGTGCCTGTAATCCCAGCTACTCGGGAGGCTGAGGCAGGGAGAATTGCTTGAACCCAGGAG 
GCGGAGGTTGCAGTGAGCCAAAATCACACCACTGCACTCCAGCCTGGGTGACAGAGCGAGAT 
TCCATCTCAAAAAAAGAAAAAAAAAAGTGTGAATTGCTATGAAATCACTATCAAAAGATCTG 

25 AGTGTTACCCTTACTCAGTGTGGTCGAATATAAATAGCCATAGGTTCCTGTTATACACACTTG 
C TCTGGTGCTACAGAGTCTTTCCTCATGGGAACCCAGTCCCTCTTTCAGTCAATGGGTTCTGGT 
TCGAGAACTGGCTGAGGTTTGGAAACTGTGCCTTTCCATCATAACTTTCCACTGGGGTGACTG 
ACCTTGGCCrrCTGTTCATCCTTTCTAGCCCCTAAGAATCCAACACTCTATTAGCCTTC 
GACCCCTATAAGCTAATCCCTTCrAGTTGTTAGTCTGACCTTGGTGCCCAATATGATAATTATT 

30 CCCACTTTGCTTCTGATATGCTTCTAAGTGCTGCCCCTGGTCTCTGCCCTTAAGTGATCTATCA 
TC CC C A C T GC CA T TAG G GG GAGA AG CTCT GAAAAAGAGTTGTCTCCCATCAAC TCT G GT C TAC 
AAAGGACACCCCTACTGAGCCTCAGCCATGTGCCCGACACCAGCAGATTCTTTACAGCCTGGG 

AGAGTAGGTCAGTTATATTTTGCCCCATTTCTTTTATCCTTTTGATCACTTCCTCTTGGCCCACC 
35 ATGTA AACTC A AGC ATCCCTGCTTC ATTT AATCGAGCTGTTGCTTTTTCTAAGCT ACC A AGAGC 
AACX^CCAGCAATATATCAGAGCCCTCTCrrGGGACCCTrGCTAGGGTG'ITAAATCCTGCATCA 
T A GG AG A ATGCCCCC AC ATC AGC A A AGTCCCCTT ATCCTCTTG AT ATCCC ACCTGCCCC AGTCC
AGCACCFK^AGQATC^ 

GATCCAGCACTTlTGGGATCTAGTCTCCCACTTCCTGCTAGTACrrGTrAGCCAAAGACTGAGT 
ICCTCTGGCA T AC AAT TTT TT T C TTCTT CC T^ 
5 ACCCTC C TrG TOCiCCAGGAGGGGA GGTAGGGGCCGATCCTGAGGAAGGCACTCATTTTCTCCT 
GGGGCACTTGTCTC TTTC AGACATCTGCATACTTTTCAAGAGAGAAAAGGCCTCCTTCTCACA 
GC A AGACTACTTCTGT A G A TGCAGGTGGCTCGTGGGAATCTGGC AATTCAAAATTCTCAAGTG 
TACTCACTAGCA CATTAGA AAACCAGTAGTA C A C ATCTCTTTCCAAATCTTCATTCAGTGACA 

10 C TACAAACC AG GGCAT CCCGCATCTCTAGACAGCAGATTGTTGGCCATTTCCCAGCATACCAT 
TGTGTATACTCCTTCCCATCAGGGCCGTGGCTTGCCTTGGTGGAGGACTCAGCCCTTGCTGAA 
G T rC TGC3^CTCCTCTTACAATrG A GTCCT A TGCCT GG TCT C CAG CTCTGCCTGCCTCACT AC A 
GGAGACAAGCATCTCTTTGAACACTGCCGAGAAGACCCTCTGGCTCTCAGGCTTGGCTTTAAA 
3CGA^aA€Cm'M3CCP3CCATO^C^ 

15 GTGGCATAGTCCITCGGGTI^AGCATCTCCCCCACACCCTCGGTGCCAGAGACACTGAGTAAGA 
AAGTA(XTC€CTCT4 ^C CCC C AT ^^ 

G AATGTGCCAGAGACTGTCAGTACTCCTACCACCAGTGAGGTGGCAACCAGCTGGGAAGTGA 
TCCAACTCCAGAGTCCCGCCCTCATAGGCTGATTTCTAGGACCACCCCTGGTATACTGTGTTAG 

20 CAGGATGAGATGGGGTAGAAATAGGCAAAGGTACAGATTCAGCAGCAGTTGAGCCTCAGTCT 
GA CCCAGCAGGGAGCTCTCAAATGTGAATGACATCACAGAGTTGTCCCTCTGAGGCAGGGGC 
CA gCCTTTGTGCTCCTACATGAGTCAGTCACTGGCTGGAGGCCCCTGGGGAAAGGCTAGGGCT 
GCCAGCTTTAGCAAATAAAAAATTAGGGCACTCAGTTAAATTGAATTTCAGATAAACAACA 



The genomic DNA for the SNARE YKT6 gene is 39,000 base pairs in length and
contains seven exons (see Table-14 below for location of exons). As will be discussed in further detail
below, the SNARE YKT6 gene is situated in genomic clone AC006454 at nucleotides 36,001-
75,000.
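
As an illustration only, the stated coordinates can be checked with a short sketch. It assumes the clone sequence is available as a plain string and that positions are 1-based and inclusive, which is consistent with the range 36,001-75,000 spanning exactly 39,000 base pairs. The function names and the placeholder sequence are hypothetical and are not part of the specification.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive nucleotide range."""
    return end - start + 1


def subsequence(seq: str, start: int, end: int) -> str:
    """Return the nucleotides from start to end (1-based, inclusive)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return seq[start - 1:end]


if __name__ == "__main__":
    # Placeholder sequence standing in for the clone (assumption, not real data).
    clone = "ACGT" * 25000
    gene = subsequence(clone, 36001, 75000)
    print(span_length(36001, 75000), len(gene))
```

With real data, the placeholder string would be replaced by the sequence of the genomic clone named in the text.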

The human glucokinase is depicted in SEQ ID NO:2:

30 MPRPR S Q L P QPNSQVEQIL AEFQLQEE DLKKVMRRMQKEMDRGLRLETHEEASVKN1LPTYVRST 
PEGSEVGDFLSLDLGGTh n FR V MLVKVGEGBEGQWSVKTKHQTYSIPEDAMTGTAE lV lLFD Y I SE C I 
SDET^DKHQMKHKKLPLGFTFSFPWHEPIDKGlLLNWTKGFKASGAEGNNVVGLLRDArKRRGDF 
EMDVVAMVNDTV A T M IS CYYED H QCEVGMIV GTGCN A CY MEEMQNVELV 
W GAFGD SGELD EFLLEYDRLVPES S AN PG Q QL YEKLIGGKYM GE LVR LV LLRL V DENLLFHGE A 

35 SEQLRTRGAFETRF^SQVE SDTGD. RXQlYNILSTLGLRPSTrDCDIVRRACESVSTRAAHM'CSAGL 
AGV.INRM.RESRSEDVMRI.TVGVDGSVYKLHPSFKERFHASVRRLTPSCEITnESEEGSGRGAALVS
A V AC KK ACMLGQ 

and is encoded by the genomic DNA sequence shown in SEQ ID NO:6:

5 ACTAGCACA1TAGAAAACCAGTAGTACACATCTCTTTCCAAATCTTCATTCAGTGACACTATG 
TCAGTAGCTGGAAATGGGCCATGGTGGGTGTATTTAAACCATGAAAATCAGAAAATGCTACA 
AACCAGGGCATCCCGCATCTCTAGA 

AGCAGATTGTTGGCCATTTCCCAGCATACCATTGTGTATACTCCTTCCCATCAGGGCCGTGGCT 

TC*CgTCG-TGC^^^ 

10 CCTGGTCTCCAGC r rCTGCCTGCCTCACTACAGGAGACAAGCATCTCTTTGAACACTGCCGAGA 
AGACCCTC TGGCTCTCAGGCTTGGCTTTAAATCGATAGACCTGAGCCTGCCATTTTCT 
CTTTTCCATGCATCACTCCACTGATCCACAGGTCTCAGTGGCATAGTCCTTCCKjGTTAGCATCT 
CCCCCACACCCTCGGTGCCAGAGACACTGAGTAAGAAAGTACCTCCCTGTCTACCCCCATCCCC 
GCTCCCCACAGGCAGGGCCTTGGCGATCCACTGCTGCAATGTGCCAGAGACTGTCAGTACTCC 

15 TACCACCAGTGAGGTGGCAACCAGCTGGGAAGTGATCCAACTCCAGAGTCCCGCCCTCATAGG 
CTGA TTTCTA(jGACCACCCCTGGTATACTGTGTTAGGTTCTTGAAGCAGAGCCTGAGA T AAG G 

a1tctggcacctgtgattgagtgggagggtgctctcaggatgagatggggtagaaa t aggca 
aaggtacagattcagcagcagttgagcctcagtctgacccagcagggagctctcaaatgtga 
atg a catc a cagagtrgtccctctgaggc a ggg g ccagcctt tg t gctcc t acatgagtcagt 
20 cactggctggaggcccctggggaaaggctagggctgccagctttagcaaataaaaaattagg 

ggtgaagtaact ttgtagat caggagtaagtcccaggaaagaagtccagctcttctct 
i ca gccctg ggcag ct gg g ggtaggcac aggg gcccagcaggcacccata 

25 gcatctcctacagcatctgaaatgaacagggtcatcacgtactacatacaaatgtacccactg 
crgagttcttcagggattatatcattaggtacttggtattttaaatacattacattatgcaga 
agtcctttgtggattgctatatttggagagttttgtgatattggggggattagatggagtttt 
cagatgggcatcatacggtttttcatttaaaaccctagagtattgtaatcctagggagtga 
tccrx3cgatragtaaattagctctccaatagattttcaatgtggttgcaaaggacatgcatgt 

30 ggttcaccctcccaggaaatccagaagggcagcattggcctgagtggcctgagtttggctggt 

TGGGGT-GGlAATGCreGAGAAAGA 

CAATGGGTGGAATGGTTTGCTTCCCTCAGTCCTTTCAGACACAGCCCAGC 
CCACCACGTCAAGCCAGTGGGTGCATCTGCAACCAATCCCCATGAGAACT 

35 ATGTTGAGGAAATCTTTGGATTGGAGGCAGGCAGAGCAGGGAAGCATC GG 
GTG A TTCT A T G A C AG A C CC AGGGCT C C A AG C TGC AG TTC AGG AGGGG C AC 
TGGCACGGCCTCTGCTCAACTCCCCCTTGAGTGACATCAGGTGAAGTGCC
G AC A ACACAGAAGGCA GCAAATG C TG CCA GTC A GG TCTGCTTCCCAGGAC 
AGCCAGTTGCTAACCCTTCTCCAGCACAGCACTGGATTTTGGTCACCTGG 
CK jGGAGC TC C AC C TC C CC AGCTGCTGCC TC A CCT GCTITTCCAAACCCC 
5 ACCCTGTAAACGGTAACrACATTTTGTGCCCACTACGCCTCGTTTCCATC 
T C TTTGG AGC A C CTCTC ACGTGG AGCTG A AC AG A ACGACCTGTTAAGCCC 
ACCGTGTCTGTTAGGGTTGTCTAGGCTGTATCAGATACCCAACTAAAACT 
GG ATTC A C C A AC AGGTATTG T C A AAG C A C ATA AGA A AGAGTCC AG AGGCA 
GGCAGCTCTCAG CC TGG T GTCAGGCTC T GGGTCAGCTTTCCAGATTCTCT - 

10 TAACCTTCCCCACATCTGCCAGATGCCGCCACAGGCACAGGAGGTACAAA 
C A A ACCC AA AA ATGTTCTGG AA AC A AG A AGGG A AGGGG ATCCCC ACC ATA 
TCTCCCCAGAGGCCTrCCTT C TCACATCTCACTGTACTGAAGCCAGCTCT 
A GCAGAAGACAG CAGGGTGAATTTGTCCAGGGTATTCAG C CC CC AG TGCT 
G G G TCCATTACT ACTTGAC CCCTGA A T AAAACAGAGGTTCCATGAGCAAG 

15 AAGGAAGGGGAACTGGATGTTAGAGGGCAAGAATGTATCCATCCCACCCC 
TA GGAG CACGC ATGGACAACTGCCCCATTTTTGCTCCTATT GCAGCCCAG 
GGGCTAC iCC C AG AGACCTrGCCAGTGCTGAGTCACAAGATGCTGGGAAAG 
TGAGACCAGAGCCTGG T CTTGGGGAACAGCTCAAGGCCGCATTGGTCTGC 
AGGTCATAGAGCA G CTG CT GAGCAGTGAGAGCCCACGATGGGCCAGGCCC 

20 TGGGTCTT^GGAGACCTGAATGAGATAGACTGGGTTC C TGTTCTCCTGGGC 
A1TG CC TC TTAGAGGGCAAAGACAATTA ACA A T AAACAAATAGAACATGA 
AGTGTTTTCCGATAGTGACTGATATACTTTGGATATTTGTCCTCTCCAAA 
TCTCATGTTGAAATGTAATTCCTTATGTTGGAGGTGGGGCCTGGAAGGAG 
G^^TCTGGGTC A TGGG G GC A GATCCCTCAT GAATGG rr TAGTGCCATCCC 

25 CTTGGTGATGAGTGAGTTCACGTGAGAGCTGGTTGTTTGAAAGAG C CTG G 
CCCCCTC T CATTCTCCTGCTCCCACTCTTGCATGAGACACCTGCTCCCCC 
TTCTCCTTCTGCCATGATTTTAAGATTCC AGGG ACTTCACAAGA AGC AAA 
TGCTAACGCCATGCTTCTTGTTCTGTCTGCAAAA C TGTAAGCCAATTAAA 
CCTCTTTTCl^ 

30 CAAGAACAGCCTAATACAGTGATGCT CTCCA AGTGACCTTTGG G CT GAG A 
CCTGAAGAAGAAG G CKjAAG C AGT TACX jT CT G AT AGC T CA T GC C TGTAATC 
CCAGCTCTTTAGGAGGCTGAAGTGGGAGGACTGCTTGAGCCTAGGAGTTG 
AAGACC A GC TTG GAAAACATAGCAAGACCCTGGC T C T ACAAAAATATTTT 

35 GGCTGAGGCAGGAGCATCTCTTGAGCCCAGGAGGTTGAGACTGCAGTGAG 
TCATGTrCACACCACTGCACTCCAGClTGGGTGACAGAGCAAGACCTGTC 
SUBSTITUTE SPECIFICATION-MARKED UP VERSION

TCG AAA A AG A AG A A AG A AG AA A GTAGG A AG A AG A AG A AG A AG A AG A AG A A 
G A AG A AG A AG A AG A AG A AG A AG A AG A AG A AG A AG A AG A AG A AG AGG A AG A 
GGAACAAGAACAAGAAGAAGAACAAGAAGAACAAGAAGAAGAACAAGGAG 
AACAAGAAGAAGAATAAGAAGAAGAAGGAGAAGAAGAAGAAGGAGAGGAA 
5 GAAGAAGAAGAGGAAGAG G A G GAAGAGGAGGAGGAGGAAGATGAGGAGGA 
GGAAGCAGAAGCAGAAGAAAAAGAAAGAAAAGAAAGAAAGAGAAAGAAAG 
AAAAGGGAAGGAGGGAAGGAAGGAAGGAAGGAAAAAGGGAAGGAAAGGGA 
A GGAGAGGG A GAGGGAG AAGGAAGAACAAAGAAGx^AAGAAGGAGAAGCAG 
AGGCTTGTGCTGGATAGCCTTGCTTTTGCCAATGA CC TT GC TGA TT TT CA 

10 GGGGGTCCTGGTGTCTTAG T CC A TTTG TGT TGCTGTAAAGGCATACCTGA 
GGCTGGATAATTTACAGAGAAAAGAGGTTTATTTGGCTGAGAGTTCTGCA 
GGCTCTACAAGAAGCATGGCACCAATGCCTACTTCTGATGAGGGCCTCAG 
TC T G CT T CCA C T CATGGCAGAAGGTG A AG C AGAGCCTGCATGTGCAGATA 
TC A C AT GG T GAGAGAG GAAGCAC G AG GG GGC AGGG AG GT GCC A G C C TC TT 

15 CCTA AT AG T A AGCTGTCTTG AG A ACTA ATAG AGTA AGA A ATA ACTC AC AC 
CCT- GCCCCCA /^GGAAG G GCAT TAATCTATTCATGAAGTATC T GCCCC C AT 
GACCCAAACATCTCCCA TT AGGCCCCCCACCTCCAACA1TGAGGATCAAA 
TTTC A AC ATG AGGTTCCGGTGGGC A A AC ATCC AGCT AT A ATACTGGGC A A 
IGCTGACC^GACTCT^^^ 

20 CAACAGAAAATTGCGTTTGAGTGTCAAGATTTTTCCTTTAGTCCCCATGC 
AGCTCCTTAGAATGAGGTGGCATCrrCTCCCTTTTCATAGGTGAAGAA4C 
AGAAGCTCTGGAGGAACGAATCATTCATCCAAGGTCAGGTAGCTAGTAAG 
CGTCCCACCAGCTCCCCAGATCTCCTGTTTCCTGTCCCAAGTCCCACTGA 
GTGAGCTGGAACAATGGClTCAC ' rGGCACCTGCCGGGAATGGTGGCAGGT 

25 GCCTATAATCCCAGCTACTCGGGAGGCTGAGGCATGAGAATCACTTGAAC 
CC 'GGGAG GC /\ GAGGT4"GC A G G GACfCC AA GATC A C A CC ACTGCACTCCAGC 
C TGGA TAACA A A CGGAGATTCCATTTAAAAA AATT AACATATAATATACA 
T-ACAGTAACAT4 X^ACTTTTT A AGT^^ 
&TATATX3GTr 4TATAA CCACC^ 

30 CTCAAATAATTCCCTCAGCCCCACCTCTTGCTGTCAA TCACTT C TCC CAC 

TTTCTAG AATGTTCTATAC ATG AG ACC ACTG AG A ATATAGTCTTCTGTGT 
CTGGCTTCTTTC ACTT A AC ATA ATGCCT AGCTC AGC AGTGTGTC A ATCCT 
CCCTCCCTTG C C ATTGCTGAG C A G TG AG TATTCCACTG TAT GG CTGTGCT 
35 ACGGTGTGTTC ATCC ATTT ATTC ATTC ,<\CC AGCT A ATGGGC ATTTGG A TT 
GTTTCCAGGCTITGGCTATGATGAGTGAAGCTGCTGTGAATGTTCAAGTA 



CAAGTCTTTGTGTAGACAGGGGTTTTCAATTGGCGGGATAAATACCTAGG 

CTGTTTTCCAATGTGGCCTGTACCATGTTGCATTTCCATCAGCAGTGTTT 
GA G AA T TCCAATTGCTCCACATCCTCCTCCCGACACTTGGTTTCACCCAT 
5 cmTAAATATTAGCCACTCTGGTGACTGTGTAGTGATATGTCAGTGTGG 
TTGTAATTTGCATTTCTATGATTGACTAATAATAATGTTGCAGATATTTC 
TGTATGCTTAGTGGGCATTTITGGTGAGTTTTTAAAAATTGGGTTGTT G T 
CACCGTCTTATTGAGTTGGAAGAATTCTTTA TATGTTCT GGATGTTTATT 
CATG T GTGTGT CT G CT AAGAGGTGAGACTGG TT C T ACCCTGGTCCTAACA 

10 AGCA CC CTGGGCCTGCATCCCTTTTTG TGTCTGTG AGCTGGGTCTGCAG C 
CCTCTCCTCCCACTACCTACTGCCCAGCAGTACCCCTCACCCATCACTGT 
GGCTCCTGCAATGACATCTCAGCCTGTCTCTCCCTCCCTCCAG CT AGCCA 
GAGGCAGGATGGCTCAGTGAC ACAGGGT GGGCCCTGAAGACAGAGTGCCA 
G GGTT TG G ACC TT GTATTAGCAA GAGTC A C AAGGGAAACTTACTTTATCT 

15 CTCCATAGCTCTGTTGTGAGGATCCAATAAATTAATCCATAGAAGAGCTT 
AGGACAGCACCTGGCACAAAGTATACATGA GCT A TT A T GATGTTA TT C T T 
C CAACCCATTGTTTC TGTGTTGTC A T AAACATGAATGCAGGACTCAGTGT 
CCCAGCTCTGTGTCCCTCGCATACATTCCCTAACAGCCCACAGGTCTTGC 
CTOTGA(XGCCXCAl^-AA^I^AX3m4TO 

20 G G CCTTGC ATTGGACAT TTC TGTATCCATATTTGTTTTTTAAAAACTAGC 
T G1TGGCCGGG CGCG G T GGCTCACATCTCTAATCCCAGCACTTGGGAGGC 
AC3 A GACAGCTG G AT CATGAGG TC AGGA G TTC AAGG CCAGCCTGGC C A ACA 
T GG TG A AACC C CATCTGT AC AA AA AATACG AAAAT T AGCTGGGCGTGGTG 
GCA TG C ACCTGTA A TCCCAGCT ACTTGGGAGGCTGAAGCAGGAGAATCGC 

25 TTGAACCTGGGAGGCAGAGGTTGTAGTGAGCCAATATAGCGCCACTGCAC 
TCCAGCCTGGGCAA C ACAGCAAAA CT CCATCTCAAAA AA AA A AAAAACAA 
AAAACAACCTAGCTGGACTTGACACTCTTGTTAGAGGAAGATTTTTCCAC 
ATCTGTTAACTTTTCTTCTATl'GTT ATC CATCTGTGCAGGTTTTTCTGTC 
CTCCTGAGTCATTrTGATAATTTATATTATATTTTGAAAATCATCCATTT 

30 CCTATAGTTGTTTATTAGTGTCTTCTCTGTTATATTTGATCAGATTACCA 

ACAGGGTCTCACTCGACAGCCCAGGCTGAAGTGCAGTGGTGCAATCATGG 
CT CA CTGCAGC CTTGACC T C CTG GG CT CAAGCAATTCTCCCACCTCAGCC 
TC C TGAG T AGCTGGGA C CTCAGG C ACACGCCACCACAGCTGGCTAATA TT 
35 TTATTTATTTATTTATTTATTTATTTTTGTAGAGATGGGGTCTCACTATG 



gcttcccaaagtactgggattacaggagtgagccaccatgcccagcccct 
atttactttatagtaagtgccttcatgggcataaatgttcctctgagaca 
gctttggctattagccatacttttaatattttgtacattcatggttattc 
atttataaatggtctgtaatg c aatgcagatttcccctttggcccaaatg 

5 cca tltac a g cagcx^ ctrttctctrtctgagca g a ca g a atattttggtt 
tcccctctgttgi ttatttctcgtctgc ctcgcctcatttgctag gtgtt 
cccct^tgtgccttaagtatgagccactcaaatatttctgtrtctctaa 
acacccctgacactgtcctgctggtttctctatctggaatatccttccct 
3cteg g c cag r n ^ c c ccctagtgcatcaaagaaatcctgctcttttgcctt 

10 c ag a a a ac a aa ac a a aa cg a a ac ct atc agtctcctt atgtcccc aa ag a 
catagctttgctggtatctggttgtattgagctgttcatttgtctcttct 
gctagatggtaagct cc ttcgaaact a a a aactaatcacttitctaactt 
c ag a ctg a gc a c a a att aggttctc a aga a ac attg a at a atg agtg atc 
c ggtatccccttccaacatatttrtggtcattgataccatcattctgagt 

15 agttactaggg a ac acttc actgc agt a acc a at ac agc a a a ac gt g aaa 
-ta cag tt acatagta g aat tgta tt t ctt gccc a tat a atagtca a gt g c 
agttcltcatcag ctgggag gtrctcctccacacagtcattraggaatcc 
a g ggaacatag cagaggttgctack:tctagacccaaacccatgtcctctt 

T-QTCCACAgEGA^QAC^ 

20 tctcagcct ccc tc g cagtgagatgtctccatgcaatttcagtggagcaa 

cttcatgttctttctcacctttccgtaggctagctgcagataatgatgag 

gctttagggagtgggtggagccataaagtagaagcctggattcctaaatg 

a cggtgtgaag tgttccctaatttcacgtaattgttt ctt aa tt tcc tgt 

25 ttgggttatttgttgctaaggtataaaaaaaccctgatttttgtgtgttg 
atatttgtgtgct gc a actt tg ctg aat t agcttatt ^gctcaatttgat 
ctcagatattagctcaaatattttgggagattatttatggttatctacat 
a agatcatgtcatctgaaata aagat a gt tc t am cctt c t ttc t atc t 
tagtccatttgggctgctgtaacaaaatgccataaattggaggctgagaa 

30 gtccaaga tcaaggccca agctaattcactgtct gatgaaggcctgcttt 

C TG GTTCATACATGGCACCTTC T AGC TG TGTCC TCACA TGGTGnAAA 4GG 

caaggtagctctc t g g ga tt cctttttgtttgtttgtttgttttgttgtt 

^ ITGTT T GATTTTTTGA CL^CAGAGTCTCA4^^ TCAC C AGGCTGG, 4GTG 
CAGT GG CACA A TCTCGG CTCAT T GCAAC CTCTG AC TC CCTGGTI^CAAACG 
35 ATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGATTACAGGTACCCATCAC 
CATGTCCAGCTACTTT r lTGTAlTTTTAGTAGAGACAGGGlTTCACCATGT 



TGGCCAGGATGGTCTCGATCTCTTG A CCTCGTGATCTGCCCACCTTGGCC 
T CCCAAAGTGCTGGGATTACAGGCATGAGCCACCGT G CCTG T CCTCCGGT 
ATTCTTTTTATAAGGGCTCTTTTTCTTTTTATGTGGGCTCTACCCTCATG 
A C CT AG C ACCTTC T AAGGC C C C A C C TCT TAATA TC ATCACACAGCAGATT 
5 TAATATATGAATTTTGAGGGGACACATTCriTreCATAGCACTITCCAGTA 
TGGATACCTTTTATTTATTTTTCTTCCCTAATTGCTTTGGTTAGAAATGT 
C nrCCTAArrGCTCCACTACTATGTTGAAAAGAAGTGGCAAAAGTGGGT 
ATTCTTGTCTTGCTCCTCTCTTAGGAAGAAAGTTTAAGTCTTTTGCCATT 
AAATATGACGTTAGCTATGGGGTTTTCATATATGACATTTATCATGTTGA 

10 GGAAATTrTCTTCTTGTTTCAATGATGACAGGGTGTTGAGTTTTGTCAGA 
TGCT'rTTTCTGCATCAATCAATATGACCATGTAGTTTCTTTGTTTTATTC 
CATTATTGTAGTACATTACATTAATTTTTGCATGTTGAACTATTCTtGTG 
TTCCTGGGATAAATTTCACTTGGTTATGGTGTATAATCCATAACCATAAC 
C-TG- AA GATATG CTGAAGAGGCTAAGTGCCATGGCTCATGCCTGTAATTCC 

15 AACACTTTGGGAGGCTGGTGTGGGAGGATCACCTGAAATCAGGAGTTTTA 
GAA GAGCCTGGG CAAGTAAACAAGATCCCATCTCTACAAAAAATTGAAAA 
TrACCGCTGGGCA T GGTGGCTCACGCCTGTAATCCCAGCACTTTGGGTGG 
CCGAGGCAGGCAGATCACCTGAGGTCGGGAGTTCTAGACCAGCCTGACCA 
ACA3^^A-AAA<^^^ 

20 TGGCACATGCCTGTAATCCCAG C TACTCAGGAGGCTGAGGCAGGAAAATC 
AC TTGA AC CTGG GAGACGGA GGTTGCA GCGAGCCAAGATCATGCCATTGC 
ACTCCAGC CT GGGCAACAAGAGCAAATCTCCGTCTCAAAAA A AA A AA A AA 
GAAAAGAAAGAAAGAAAGAAAAGAAAAGAAAGAAA A TTAGCTTGATGTGG 
TGG'ITGTGCACCTT r AGTCCTAGClACTCAGGAGGCTGAGGCAGGAGGAT 

25 TGTTTGAGCCCAGGAGGTTGAGGCTGCAGTGAGCCATGATTGCACCACTG 
CACTCCAGCCTGAGCAACAAAGTAAGACCTCATCACTAAAAACAAATTTT 
TTAATACTGAAGAATTTTATTTGCTGGTATTTTGTTGAGGATTTTGCATC 
TATATTCACAAGAAATATTACTCTGTAGTTTTTCTTCTTGTAGTATCTTT 
GTCTGGTTTCAGTATCAAGGCAATGCTGGCCTCA'rGAGATCAATCAGGAA 

30 GTGTTACTTCCTCTTTTATTTTTTGGAAGAATTTGAGAGAATTGGTGTTA 

GG ATTTTCTTTGTTG GG AGGTTTTTT AGT ACTA ATTCC ATTTCCTT ACTT 
GTTATTAGTCTAATGAGATTTTCTGTTTCTTCTTGAGCTAGTTGTAGTAG 
CTCATGTGTGGAATTTTTCTATTTCATCTAAGTTATCCAAGTTTACCTAA 
35 GT TA AA G TTCCATTTTATCTAACTTGGGTAAGCCAACAAACAATACTAAA 
3WrTCATAGTATTCTCTCATAGTCCTTTTTTTCTCTAAAGTCAGTAATA 



ACGTTCACTCTTTCATTTTTTCATTCCTGATTTTAATAATCTGAGTTCTT 
TCTCTCCCCCTCCCTGCAATTGAGAGTCATTTAAAAGTGTCTTGATTAAA 
TTTTATATATCTGTGAGTTTTCCAGTTTTCCCTCTGTTATrCTCTTCTAG 
TTT TATTTCATGTGATCCAAAAA GA TACTTTATATGATTTCAATTTTTTT 
5 ACA 1TTACTAAGAC1TGTTTTGTGACTAAAATATCCTTGAGAA1TTCCAT 
GC AC ATTTGAG AAA A ATGC AC ATTCTGCTGTTGTTGG AC AGAGTGTTCTG 

TTCCT T ATTGATC TTC TGTCTAGTTGTTTAATCCATTATTCAAAGTAGTG 



10 GAGGGTGGATCACAATGTCAGGAGGTTGAGACCAGC C TGGCCAACATGGT 
GAA ACTCCGTCTCTACTGA AAAT AC AA A AA ATTTGCTGG AC ATGGTGGC A 
CACGCCTGTAATCCCAGCTACTCAGGAGGCCAAGGCAGGAGAATCACTT G 
AACCCAGGAGGCAGAAGTTGCAGTGAGCTGAGATCGCACCATTGCACTGC 
AGCCTGGGCAACAGAGCAAGACTCTGTCTCGAGAAACAACAAAAACAAAA 

15 ACAAAAAACAAAGTAGTGTACTAAAGTCTCCAACTACTATTGTAGAACTC 
TATTTCTCCCTTCAATGTTGCAAAiVTTTTGTTTCATGTATTTTGGTGTTC 
T G TTG T TTA T AATTTTTATA T CTTCTTAATGGATGAAAACTTTTATCAAC 
ATATAATGTTCTTTGTCTCTTGAGACTTTTTTTTTTAACTTAAAATCTAT 
TTC K3G CT GATAATACAGCCACCACAACTCTCATATTGGTTGT TATTFICA 

20 TAGAATATCTrCTTCCATCCTTCTACTTrAAAATTCTTCTATCTTTATAT 
CTA AA GTGAG C CTCTTGTA GA T AGCATATAGGTGGATAATG TT CTC TTT A 
- TTCACTCTGCCAATATCTGCCTTTTA - ACTGGAGTTTAATCTATTTATATA 
TA A A AT A ATT ACTG ATT A GG A AGG ACTT ACTTCTACC ACTC AGCT ATTTT 
TTTTCTGTGTGTCTTATACATTTTTAAGTTTCT 

25 GATTTTTTTTTTTACTTCTTGATTTTGTGTCTGTGTTGTTACATTTTGAT 
-T-ATTTTCTCC T TTT GATAGCGGC A GGA GG CAGCCAAATGCC T GGCAGATA 
GAAGCTTGTCCCCCATGAAACC C CACCTTCAAGCCAAAAAATAGCCTGAA 
GG CTGAAAGACCGGACTGCTGGTCCCAGATGAAACCCATGATCCAGAGTG 
AGAACTTCCATTCCTGTTTGCCTGCCCTCTAAATAATCCCTTITAACCA^ 

30 TCGAATGTTGCCTTTTCCAATACTACCTATGGCCTGCCCCTCCCCCATTC 

TTC AGGT AGGGGG ACCACCTCTGTATCCCTTCTCTGCTG A A AGCTGTTTT 

€AT£A€4^AA-TGA A 

CXT4^AT4^^^ TTGGGTGTGGTx4CAAGAACTC G GGAACCAGTG C ACAAGC 
35 CAGACTTGGTCTGGGCAGCACGGGTTAGTGGGCCATCTCCCACAGCAGGT 
A GC A TG G CCAAGmA G GCCTGGGC A G G GC A T CAC CA AGGTCCCTGGCTTG 



C A A AGTG ACC A AGG A AAA A ATCCTGTGTC ACTTTCCTTTTCTC ATATTTT 

CTTATAAGAAGCTAGTTTGAATAATAGTTCCAATAGTACATGAACACTCT 
ACTCCTATA^TCTCCATCCTTCTTCCTTTATATTGTTATTCCCACAAAT 
5 TATG'rT"rn\^TACATTATATCCTCACTAACATAAACTrATTATTAT'rTrC 
TG C A T TTGCCT TTTAAATCA TACAGGAAAACAAGAATCACAAAGAAAAAC 
TACATTAATATTTGCTGTTATATTTACCTATATAGTGA C A TTT AACAGTG 
TAnTTTATGTCTTCAGATGTCTTTGAATTACTACTTAGTGTCTTTTCAT 
T'rTA G CC TCAAT GTTTCCCTTTAGCATTTCCTATAGGGCAGGCCTGCCGG 

10 TA ATTA ATTCCCTTTGGTTTTCTT TATCTGA AATGTCTAATTTCTTTTTT 
ATTCTTGAAGAATAGTTTTGCTGGCTATAAGATTCTTAGTTAATAGTTTT 
TTIXrCCAGCACTTCAAITATTATTAAAGTGTTATTATTAlTATTA T TATT 
ATTTTGAGATGGAGTCTCCCTCTGTCACTCAGGCTGGAGTGCAGTGGCGC 
AAT CTCT G CT C AC T GC A ACCTC CGCCTCCCAGG TTC AA G CAATTCTCCTG 

15 CC TCAGCCTC C C G AGTT AGCTGGGATTACAGGTGCCCGCCACCATGCCCA 



CTGATCnTGAACTCCTGACCTCAAGTGATACACCCACCTTGGCCTCCCAA 
AGTGCTGGGATTAGAGGCATGAGCCACCATGCCTGGTCTAAAGTGTAATT 
ATTATTAC A GCTGCCATTTGGCCTCCTTGGTTTCTAATGAGAAATCATCT 
20 GTTAAACTTATTGCAAATCCTTGGTATGTATGCTATGTGTCATTTCTCTC 
3::iX;CTGC:m;CAAGATTCTClX:iCTGTCTTTGTCTTTTGACAATTTGACT 
ATA ATGTGTTTC AGTGTGA ATTTCTTAG AGTTTATCC C A C TT GG ATTTC A 
TTG AGCTTCTTGG ATGTGTACGTTTGTCTTTC ACC A AATCTGGGAA ATTA 
TITCACCATTTC 

25 CTGGAGCTCCCGTATACTTAGTTGGCATGACTGATGGTATCC TA CTGGT C 

GGATAACTTCAATCGCCTTTTCTTCAAGTTCAATGATTATTTCTTCTGCC 
TGCTCAAATTGQCCATTTAACCCCTCCAGTGACTTTTTCAT^ 
gEACTTTTCAGATCCAGAATITCTAT^ 
30 TTTATTGTCATTCCCCATCTGTTCATACATTGCTCTCCCAATTTCCTGTA 
GTTCTTTGTCCATG GTT T TCTTT A GT TAAT^ 

GACTTAATGTCTTTGACTAGTAATTCCAATGTCTAAAATTCCTTATGGAT 
AGCTTCTTT T AAATTArTTTTGTCCTGTTAGAGAGTCATATCTTCCTCTT 
TATTTGC TT TG TA ATACTTTGTTGAAAACTTAACATTTTGAGTAGTAAA A 
35 TGTGGTAATTCTGAAGCCAGATTCTCCCCCTCCTTTGAGATTGGTTTTGT 
TGTT TGTO 

ATTTTTGCAAAGTATGTATTCTCTCTTGTGTCTGGTCACTGACGTTTCTG 
TTCTGGTGCCTCTGCAGTCAGCCTATGACCTGGAAGAGCATTCCTTAAu/VT 
GCATAGATTTTTTTAAAACCCAAGAAACAAAAAACCTAGCATGTATGTAC 
CTTTTTAAAAATCTTCTGATAGATGCCACCTGGAAGGCTGCTGCTGCCTG 
5 AAGGGGCAGAAACAAAGGCAAGCTCTACTCTGAGCCCTCAGGGAACCACC 
AG AT A A AC A A A AG A A ATTTG ATTCTCC A A ATTTCTGG AAG AC A AG GTCCT 
rrCTGCCCACTCCTGC'rCCAGCCAGCTGCTCTAGGAACACAATTACTGTC 
CACATGGCCACAGGAATGTTGAAGAATGCAGGATGGTAGCTGGTTTGCCC 
ACACCACTCACTTATGAGCCATCAGCATGCCTCTCCCTTCATCGAGCACT 

10 CCCATGGTTGCTGTAAGTGTCCAATCAGGTTCCAGAATTCTGAAAGAGTT 
GACTCTTACAGGATTTTTTTCTTTTCTAACTTGCTGGTTGTTTAGATAGA 
GG AACCAATTCCTGAAGTTTCCTACGTTGCCAGCTrCATGAGGATCATI'C 
CCTAGTAACTCTTTTCAGACAAAAAGCTTCATTGATTTACTGTAGGAC T A 
G CATCAAAGAGTCTATGCCACCTAGTCTGTCTCCTTAAAACACAGAAATA 

15 ATCAGTATGCATTGGGGTAGGAGTTTGGCATTAGATCTGCCGTAAATCAA 

CITTGGTAGCAACAACAGCTACAAACTATTGAACAACITGTATGTGCCAA 

GAGCCTTACCTGCATTATCCCATTGAATCCTCTCAACAGCCCTGTGAGGT 

AGTAGAATTGTTGCCTGCCCCTTACTGAGGCCTAGAAACATTAAGGAATT 

20 TGCCCGAGGCCCTAGAGCCAGTGAGTGGCAAAGCCAGTCTCCAGACTCAG 
GCTGGAGATCCTACAGTTCTGTGTTACC CC AGTG T TATCCTGCC TCTC A G 
CAC AGAGTCT TG GATGATTC T CCTAACCCCTCCCTAGGCAATGCACAGGG 
CTGCTCCCTGCACCCTTACTCATGCTCTGCTCTTCAACCCCAACAGTGCT 
GG CCTTAGGG TT rA TC C CTGAC A CCCAGCCCCA GGCTCCATTCCATCTGT 

25 TGACAGAGGCAAACACTGGGGCAAAACTGACCTCTGTGGATACCACTGTG 
TCCAC CTCCA CCAGCTTCAGCTGAAGCCTCTGAA CAT C T CCAGCATGGAA 
GAAGCCCCAAAGGATATTTCCTGTCCCCCAGCATATGCTTGACCCTGAAG 
CCCTCCCCATCTAGTCAAGAAGACCAAACTGTTAACAATCCTGGAGTCAG 
AGTGACCCATGGGTGAATCrrAGCCAAGTCACTCATAGCTGTrGCATCCT 

30 AGTAAATCCCTTAACTCCCATAGGCTTCAGTTTCCCTGCATATAAAATGA 

TCCCCTGGCATGTGGAAGCCACAGGGCACACACTAGTTGTGGTCATTTGA 
TCCTGGCATGCTCTGCTGTCTCTCGGCTCTCCCCTTGCCTCTn^CCCTGA 
TGTCCTGGCCATCAGCCACTGCCTAACACCCTCCCACTCACCAGGCCCTT 
35 AGCCTGCCCCTTAGCACAAGAGCACAGCCGGTCTCAAGTCTACCCTGCTG 
TAAGCAAACACTTGCAACATCATGCTGACCTCCAGGCCCTGTTGCATCAG 

CGTGCCCACACTTGGTGCCCAGCTGGTACTGAGGGTATCAGGGAACAGGC 
CAGTGGTGGAAGGGCGGACACTTT'GGGTTCCCTGGTTTCCTGGCTCCCAA 
TATCTTTCCCAATGGCATATGGGGTCTAGCAGCTTGGCTCATTTAACTGT 
G A ACCTCT ACCCTTT AG A ATCTGGGCCTCC AGGCTTGCTTCTGTGC AA A A 
5 TGGC AG ATA A GGCTC A ACCTTTCTLTTTTT A ACTfC ATTGTT A A AT ATT A 
CTCCATTAATACCCATTTACTGCAGAAAAGGTAGGAAATACAGATAAGCA 
AAAAGGAAAATAAATTAAAATCCTCATACCACCATCATCAAGATAATTAC 
TGTCACCATTTTGGTATATTTCCTCCCAATACATATATTATCTATATCGT 
A:rAT-ACGACAAAAAT G G A TC A -TAC'T A TGTTTCCTGTTCTT C C C CT G TGT T 

10 AGTCATCTATTGCTGTATAACAAACTGCCTCAAAACTTAGTGGCTTCACC 
TTTCCGTGTATTATGATGACAAGAATGTGGTATGACACTGTCTTATATCT 
GGATCATATGCTAAAAGATAGAAAATGGTTTCTAAACTTATTTGTTCTGT 
AATAACAAAATTTTATTTCATAAAGTGTTTTTAAAAAAAACCATAGTAGC 
TTGAAACAACAAACC1TTGTTATCTCACACAGTTTCTGTAGGTCAAGAATTCAGAAGCAGCTT 

15 AGCTGGGTGGTCTGGCTTGGTGTCTCTCCTGAGGTCAGGGTTTTGGCTGGGGCTGCATCACCT 
GAAGGCTTGACTGGGGCCAGAGGA 

CTGCTTCCAAAGTGGTCCACTCACATGGCTGGCAAG-TTGGAGTTGCGTATTGGCAAGAGACTT 
CGCTTCTTCTCAATGGATCTTCCCAGAGTTCTTGTAGGCAACCTCATAGCATAGCAGTTGGCTT 
CCQXAaAGGa AAC AGTCC A CiG AG A GAAC A A GGCAGAAACCACA GGGTCTTTT CT G GCTT AG 

20 GCTCCAAAGTCATACTCCACCATTrCTGCATTATCATATTAGTTACACAGGCTAGACCTA 

TTCTGCATGGAAGAGACTATACCATGGG GT GA ATAC CAGAAGCAGGGCTATTGAAGGCCAGC 
TTCAAGGGCGGCTACACATTCCCT.rTCAACAGTA TG TCATGAACATCTTTCCATGCCAATAGA 
GCAGATGAATCTTACCATTTTTAATGACTACATGTAAGTGTAGCATAATTTATTTAACCAACC 
ICC- TGTAGTTG GGTATGTGGGTTGTGTCTCGTTTTTTGATAGTAGAATTAATCATCTTGAA 

25 TATCCATCACCAAACTTGTCATATTATTTTCTTTTGATGAATGAAAAAGAAAATCAAGTCATG 
- TCTGTCAATCAGAACCCTGAGCAACTAAGAAATGGGGGTACCACTGGGACATAGAGCAAGGT 
CCCTTCTGATTCTGCTCTTGTCTTTCTCTCCCCATGAAATGGGGAGTTCACTATCTACTGAGAC 
ATCCTAGCCCA CAGCTGCACAGTTCTGTCTTTTTAGAAAGCTCTAAGCAGAAACAATGTTC 
A TCCATCCTCCTCGGGACAGCCC r n'GAGCTACTGAAGACTCTAAGCATGTCCTGGTCATCCTCC 

30 ATGAGCCATCATCTCTGAGG CCCT CCCCTTCTTGGCCCCTCTTCTCTGGACAGGTTCTGGACAG 
Tg^ GCCCTTCCAAAA TT C CTX jAA A GCAGGAACTGTTCCT 

GCAGTAC AGACTGTTGGTGTCACC'CCTTATCCTGAAGAAGAGGCACTGAGACAGGAC 
A AGGGTCtGGTGCCCAGGAGGGCTGGCATGAGTCATGAGAATCTGGTCCCGGAGAATTAGACG 
G- TGTGGGGAAGTAGGGGTGTTGGGCCGCTTTCTGGCCTCATGGATGCCAATGAATATCAGCA 
35 GGTGGCTCCCAGAAAGGAACrCTAGGGGATGCCTGTTGCTCTAAATAGAGGCTAGAGAGGGC 

CAACCAAGAAAGGGGGCCCACTTGCCTCAGCTTCAGGCTTTGTACACATC 

CTCAGCCTTTCXrGAGAACTGAATTTAGA T TC T CCTCCCCTGTGCTG TGT G CTTGGCCC A G A AG 

AAGGGCAAGI^CTCGCTGGGTGGCTGCTTCTTGGCCTGGCTGAACCAGAAGGCCCCAGTGCCAC 

5 GGGTCAGTGGCCTGTCACATGAGATCTGATGAGGCCATCTCTGCTCTATATTGGGAAAGG 

GATCAATTGTATCAAGGGCTTTCTTGGGAGTGATCACTCTGGCCATTGGCGAGAGACCTGGCA 
-rrCTGACAAGGCACCCTCCATACCCTGACCCACriGCCAGCTCCAGCTAATTITAGCAGGCTrT 
GGCAGGTGCCAGCAAGTACATAGCATGTGGATGTCACTCCCAGGTGAGCCCAAGGAGAGGCC 
TGGGCCAGAGC CTGGAAGTCATGGTCTATGCCCATGGAGGCACCCAAAGCAAGCCTGAGG G 

10 CTGGACTTTGCAGTCACAAAATTAAGAATGATACCCCTGTTTTTTGTTTG 
TTTTGATCAGTTGGCCACCTTCCTCCACCACCCCTTCCCCAAGTTCCA T 
CAGACCC' CTGG A'ITGTAT GAAA T G CA A AT CGA AC C TCT CTGCAGATGAA 
AATCCACTGGGGATCCCCTTGCC T C C AAGAG CAAGTC CAGACCTGCACCA 
GCGCGGGCCAGGCCCCCTTAGGACCCCCTCCCTGTCCAAGGGCATTTCAG 

15 TAAGTGTTCTGTGGCCAAGGCAGCCTGGTGACTTTCTGCCCGCACAAGGC 
T GAGGAATG CiAAGA TCGGTAGGCTGGCTCTGCACACCCCCTCCTGCTGGG 
CAGCAATCCCTACCCCATGTrCACAGAGTGTGGCCGGCTGCCCCATGGCT 
CTGTCCCCGTGGCCCTGTCAACTGTTACCCACATGGCCTACCCTCCCTTT 
C TGCC C TGCCTCTG ACCCCATGGCAGGGGGC A GAGTA T TTGAGCAGCCGC 

20 CAGGCTGAGCCCTTTCAGTGCAGAAGCCCTGGGCTGCCAGCCTCAGGCAG 
CTCTCCATCCAAGCAGCCGTTGCTGCCACAGGCGGGCCTTACGCTCCAAG 
GCTACAGCATGTGCTAGGCCTCAGCAGGCAGGAGCATCTCTGCCTCCCAA 
AGCATCTACCTCTTAGCCCCTCGGAGAGATGGCGATGGATGTCACAAGGA 
GCCAGGCCCAGACAGCCTrGACTCTGGTAAGGGTCACACCAAAGTTAGGG 

25 ACT TT GCACTGGGAGAGCAGCACCCAGGGCAGGGCCCTTGGTTTTGCAGA 
CTACCA A A A GTAAGGCTG G GGG C AG GG AA GGC G A GC AGGCTTGG G GCA C C 
TTGGAAGGAGGC^ACATGGGCCTTGGGGGTC CTGGCTAG GGCAGCTGTGCCTGCCA C TGG C C C T 
CTGCCCACG A CC C CTCCTCACTGTGGCTATCCA GT G T CCA GC CTCT CGAGGGGT TCTAGGGTAC 
^ rA 'ITCCTC^ \ GC TAA CGG 

30 GGGGGGAGCT GGCTGGCTGCCCAGGGCTGTCTGGCTCTCTGGGGGCTCTGCATGGCATTT 

CCAGGGGTT G GTGGATCACKSGA TT CT GT C C CTCAGGAGAATGTGGGC A C T AGCC C AAGG C X I A 
CTCACTTCTGTGTACATAGCCACCTGAGGGCCCAGGAATGGAGGGGGCCAGGCTACAGCTGG 
ACA TCTGGCACTCGGATGGGCTCTGGAGCCCCCAGGCCTGCAGCATCTGCCCAGGGACTGCCC 
TGGCCCTTGGCCA TTTCCTCAGGGACCCACAGCTCCACCAGCCGGCCCCTCCCAGTGCTGGAA 

35 TAGACAGTTCCTCAGTCCACATCTGCCAAAGGCGGCACTAGAAGGCATCCTGCCTTTTT FACX 
GC GTTCTGGAGGTGGGGTCACAAAGCACTGCTCACTGCATAAAAGGGACAGCATCCTGCCCCT 
GGCAGCCCTGCCTGACCAGCTCCGCCTCTCCCACTGCTATCCAACCTGTACACCCTGGTGACCA 
TGTCOvGGCC-AGTGGC CTTA AGG A C TGTCTCTGT ACTGATGGCTCC AC ATCTACCTCTCC 
AGCCAGACTCTCCTCTGAACTCGGGCCTCACATGGCCAACTGCTACTTGGAACAAATCGCCCCT 
T GGCTGG C AGATGTGTTAACA TGCCC AGACCAAGATCC C AACT C CCACAACCCAACTCCCAGG 
5 TCAGATGGAACCTCITCITCCCAGGCCCITCTGTTCCTCTCC T CAGCCCCTCCCACCT^ 
GAATAAGT CTAGACTCTTATCGCTTTCACCAAGCCTGCGCCCAGCATCCCTGCACAGG 
GATTGlTAGGACAGCCTGACGCCCTGCTrCCACCCTGCCCCAAGATGCCCCTGCTCTGCAGCCC 
GGCGCCTCCAGGCTTCTCACCTCCTGCTGCTCACAGCTCAGCCTCACTCCCTCCCTCCCCGCCTC 
T GCTCCAGCCTCAGTGCAGGT 

10 CCCTGCTCCCATCTTCTGGCAGCAGCTGCCCGACCTGGTCCCTCTTCATCTGTCCCCATTCCTTC 
ACCCCCCAGCCTGTCCCCAACTTGACTGAGGTTCTTTCCTGCAGATCCCCGCCCTTGAGAGGGG 
TTGGTCCCACTGTCAACTCTGCTTCTGTGCCCTGTGCCGCACCTGGCATTCAGTGAGCATCTGC 
TGAAGA GATGAGGGTCAGATGCCCTGCAGGGAGTGTGGGGGCGTCCTCAGGCAAGA 
AAAGTTGTACGTTTGGCTGTGGGCCCTGArrATGTGTCCTGTGACCTCTTGGGTGAGG TGAGG 

15 AAGAGAAACCTCTGCAAGCTGGCTGGGGCTGCCTCCCAGAGGCTGCCAGGGGGAGGGACAGG 
C TC T GT C TG TG C TCTTCTTCCGAGGCTACA C C T GGGGCGCCAGGCTCTCAGGGCTCCCCAGGTA 
C C AC C ACA 'nTC CTACACTGCTTGGGAAAGCCCTGTAAGTlTGCACAGACACCCAGCATGA 
GGCTCGCCAGAGAGATACTTGTAGCTGGGGTCTGGGCACCAGGAACAGCTTGGTGCTGGGCC 
TOAAGIC;GGGeAGGATG^ 

20 CAAACTCCTCTGTGGCCTATGGTTCTGTGGGCTTGGGGAAGGGTTTGTACCTCTGTGTCCAGT 
TTCCTCACTTATA 

AAAAAAGGAGATAATAAAAGTACCCATGTCCCAGGGTGGCTGTAGCAATA 
A T AG G GAGGGGTGCCCAGAGC A GG TC TGGCACACAGGAAGTGTGCATCAG 
GCTCAGTCCCTGCCATTGGGCTTGTCCTGGGAGTCTGTGAAGCCAACCTC 

25 TGCTCCACAATGTGACCCCCAGGCTTGTGAGACCAAGCTGGGTCAGAGCT 
TCCTCCICTG^ jGGTTGCACCAGGAGGGGAACTTCTGCAGGCCCAGATGCA 
CCCTGAGGAAAGGGCTTGTTCCCACCAAGAACAAGGCTCACCTTTGGAGG 
ATGCTCCCCACATGAGAGGTGAACCCCCAGGTCTACTGGTGACTGCAGCC 
TCGGAAGCTGACAGCATCTATCCTCCAACCCATGCCCACTGGGAAGTGTG 

30 TGAGGGGTCCTCATAGGCCCTGCGGTGTGGACAATGCAGAGACCCTGTAG 

GCCGGTTCTGCCACTGTGGGGAGACAGGCTCCCCCACCCCATGTCCCCTG 
CTTCCCTGC AGCCC AC AG AG A AT AC AG ACCTACTTTTAC AG AA ATCC AG A 
TTTTTGTGTAAAAGTGTCTCTATTTTAAGTAGATTTTAAGTGGTGGCAGC 
35 AAATTTAAGCTTTTGAGAATATTATACAGAACAAATCAGATTCACAGGCC 
AGATOCAAOITAITTA^ 
TCTCACGTTTTCACTTATGCCTATACGTCTCCTTCACGGGAAAGGCCACA 
AGAGGCCCTGCGGTAAGTGTCCCGGTGTTGATTTAAAGTCCCCAACAGTG 
AATATGAGGGTCCTCACTGTTGCAGCAAGAGGATACCCCCCTGTGTATCT 
TGGAAATGCCTGCAGCCCTCTTGCTGCAGAA C AGA T TCTTAGGAGAGA A A 
5 CTG TCAGAJGA AAG'n - AAAC lTAG AGAAACTCCAAATrGCCCTCTGAACA 
G AC GGTATCAG TTTGACATCATCCAATACCGGGATTCCTCGGGGAGAACT 
TTCTGGCCTAGAAGGCAGTAGAGCCAGGACTTCACCCAGTCAGTGGCAGG 
GCCACACGTCGGC CITGAT ACAGAGGGGGAAGACTTGAGCCTCCTCGACA 

10 TTGTACCTAGC T C TC CTTATTCTCCAAGGGCTGGGCCAG r n'CTCCCCACA 
GCCCTGCAAGGGAGGATCACTCAAGGGCCCCAACTGTCTGACAATACAGC 
CACACTCTGATCAGCCACCTGGGCATAGGCTCCATGCCATTGTCCTCCGC 
CAAGACCTCAGACTGAAATGTTGGCTCCTCCCATGAAGAACCTGGGGCCA 
AAGGACCA G AGTCCAGG T CCGTGGCTGCCAGGATGGGCCACTTGGAGAGA 

15 GGCACAAGGGTGGTGCCAGG C AG G TGTGAGGGCTGGACCTTTGCAAGAGC 

agcatcacttttgttgagagccc ac aggtatcttataattgggtcctagg 
actrc c tgccagtagccaltgtg t gcatgg a'rtt gggtg ct g gc c tcacc 

:tttc 



TT C ATCTGC A AG ATGGGTGCTG C TG GCACC T CC T CCCCGGTGCTGTGGTG 
20 ACAGGGCATAGTGTGTGAGGCTGCTATGTGAAGCACCTAATGCAGGGCCT 
GGCATATGGAGGAATTCAGCAAATGACAGATGCCTTCACAGTTAGTTCCT 

GTTG TAG C T GT G GGGCATTGA G GACAGCC T GG A TTGTTCCACAGAGCCCT 
GAGGACATCl'CCAGGGGTGTGCrCTGCAGGGGCAGCTGGATTGGAGGGTT 

25 AGGGGTCGGGGAGGGCGTGCACTCCCACCCATGCTCACAGCCTCGGAACA 
G- TGCCTGC T CAGCCAACA T GGGTG TTT GAT T C TGTGTCTTTTGTC A C A G A 
CTTTATCAGCCCCATCCCTTTCTGACCTTGCCTCAGTTTAAATTTTACAT 
GTGGG G CC T CATTAAGAGACATGGTTCTTAACTAAAGATCTGTATCCATT 
AGGAATGCTn'GGGCTGCAGGAAGACAAACACCTGAC TCACTGT G GCAT A 

30 AGTGGTTTGCGTCTGCTCCCATAAGCTGCACGTGGAGGGTGG ATCT GG CA 
TTACTC r CTCTTCCC T AC ATITC CAGTATGCrrAACAGCTTTAACCTC^ 
CCTTGTTTCTTCATGGTTGCAGGGTGGCTATCACAGCGCTGGCCATCACA 
TCCT rACACAG CT G T GTT T ACAAATTTAGGGGGACA T TGAA GCTCCTCCC 

35 ATGCCCAACTCTAGACCGATCATCAGTAAGAGGAGTATAGAATTGCTGTG 
CCCACCaAaATTAATCATGGCGCAATGTGCTCCCCATACCAACAAAATC 
TGAGTTCTAGAAACTGAGGAAGAAGAGGAAAATGGCCGTCTTGCCTCCTG 
GCKK5£APX ^GA G C ATCTC 

GGCAAAAGTGTGTGAGTTTTTGTGGAATGGATCCACGGCTTTTATCAGAG 

5 TTITCTeACCACTGCCCAGAGATAACCAGCACATTAACATOGCCTTTTCT 
CCATGAATAGCACTAGGGTGCCCAGTGGACAGACACATAGCTGTCCACAC 
ACCAGCTTGCTGGGGATGCATAGGCAGAGTCACATCTGCACTCACGGCCr 
GTCCTCACACTGCCATGTGGAGAGCCAGCAGCCACACCATGGGCCGTCCA 
TGCTCACGGGAGTGGCAGTATCAGATCTGAGCTTCGTGTGCCCAGGCGTC 

10 TCTCACATCAGTGCATAGGGACCCTCTTTGTTCTGTGGCCCAGTGTGCCC 
ATGCCACAGATGGCTTCAGTCAGCAGACACCTCCTTCTAGACACTCACAC 
TCACTCCTGGCTGGCCCTrAGCACACCTGTGCAGACAGGCCCATTTA'rrr 
TCTTGTGTAAATCCCAAGTAGGAGGACTGGGTCTCTCTGACAGCAATGCC 
AGCTGCCTGGCACCCTCCAGACAGGTGGCTCAAGCCCCACCTCGCCAGCT 

15 CTCCCAGTTAGCCCCTCCTTrCCCTGGCTCTGACCTGAGGGACGAAGCAG 
GGTGCTACAGGACGCTGTGCCACAGGGATATCGTCAGG GA CAGAAG GTAC 
^TGCaTCTGCTGCTCACCCCTCCAACACGCrrGTGGCKrrGCATTTGTTG 
AGTGGCTGGTACCAGACTCTGCTC^rTCTGACTTTCCAGCTGGTTTTACCT 
GTAGTAAAG TT TGAGAAGATG G GT C ATC C TGACCCCGGGGTCAGAAGACA 

20 GAAGGAG GCCCATGGCGTGTGGGGGAGATGCCCCGTGAGGCCCTCGGTGT 
GCAGATGC C TG GTGACAGCCCCACCCTGA GGTCCCC AGC CT ACCCCCTCC 
C C AGCCCGACTGCTCCCATCCCCCTCCCTGTGCAGGTAGAGCAGATCCTG 
GCAGAGTTCCAGCTGCAGGAGGAGGACCTGAAGAAGGTGATGAGACGGAT 
GCAGAAGGAGAT G GACC GC GGCCTGAGGCTGGAGAC C CA TGAA GA G GC CA 

25 GTGTGAAGATGCTGCCCACCTACGTGCGCTCCACCCCAGAAGGCTCAGGT 
A CCACATGGTAACCGGCTCCTCATCCAGAAGCAGCTGTGGGCTCAGCCCT 
AGCTGGGAGAAGCACCCCAGGCACTCCCAGACTCACAGCCAGCCCGAGAC 
AC ^ATCTCCTGGGGAGCAATGAAGTCCTCGACTTGGGCCAGTTCTCACCC 
ITGGCTCCirTGGTCXGGCCCTGGGGCACTCGGG Cr CACC C TGGAGC KlG 

30 CAAACCTCAGGAAAACTGGCGTTTTAAATCTCACTCCTGGCCAGGTGCAG 
TO GC TC AC C CC T GTAACTTCAACACriTTGGGAGGCCAAAG C AG G C GGATC 
TC'rTGAGGCCAGGAGTTTGAGACCAGCCTGCCCAACATGGTGAAACCCCG 
TCTCTACTAAAAATACAAAAATTATCCAGGCATGGTGGCACATTCCTGTA 
GTTCCAGCTACTCGGGAGGCTGAGGCATAAGAATTGCTTGAACCCGGGAG 

35 GCCGAGGTTGCAGTGAGCCAAAATCGCGCCACTGCACTCCAGCCTGGGGT 
G A C AG GGTG A G A C A CC ATCT C A AAA AAA AAA A A A A A A A A AG ACCTC ACTG 
CTCCCCATGGGCACTTAGGGAACTCTCCCAGCCCAGTTCTGCAGCTGGGC 
GATT-GC ACT AG ATCCTC A GTTGGT C CCT GG GCTgDSGGTGAGTO3-CCAGG 
GCAGGAGTTTCCCATTGACTTTTCCCTGGTTGACCT1TGACCCC r rTCCAC 
AGTT GACACTGGTGTCCCCAGGTGTCTGGTGGCCCCTTGTCCAGCTCCCT 
5 TAGTCCCTr GTGCCTTC CCTCCTC CTCTT'fGT AA T A TCCGGGCTC A GTCA 
CCTGGGGCCCACCCAGCCCAAGGCCAGCCTGTGGGTGTCCCTGAGGCTGA 
GACAClTCT CTCTGTGCClTTAGAAGTCGGG GACTl^CCT CTCCC'rGGACC 
TGGGTGGCACTAACTTCAGGGTGATGCTGGTGAAGGTGGGAGAAGGTGAG 
GAGGGGCAGTGGAGC GT GAAGACCAAACACCAGATGTACTCCATCCCCGA 

10 GGA C G CC ATGACCGGCACTGCTGAGATGGTGAGCA GCGCA G G GGCCGGGG 
C AGGGGGCC A AGGCC ATGC AGG ATCTC AGGGCCC AGCT AGTCCTG ACGGG 
AGGTGCCAG C TGTCTACCAGGGGTGGGGAGAG C GGGGGCl'GGAGGACCAC 
CCAGCCTCAGAGGCAGCTGGAGGCCTGGGTGAACAGGACTGGCCAACATG 
K'CCCAAGTCCCACAGTCACCATCTGGCCAGCATTGAGAGGGGAACGGGC 

15 TGAGGAAGAGTTAGTGGCAAGAGGAACCCCAGCCAGTCACACCTTGTCCA 
G3-TTACCAGA GG A AAA ACC A ATGTGT A AG AAC A G A A A TG T G A C C C G GCAG 
CCAGTGCACTGCCCCCCTCTCCAAAGGCCACCCCTCACCCTCCACCAGCA 
TGCACAGAAAGTGGGGTGACAGCAATCACAATGTCTACCCAGGCAGCAAG 
GACeGCTGAGCATOGGGAGC^ 

20 GAGGCCTAGGGGGAGTTGGGCAAGGCCAGAGCCCTAGCTGCAGCCAAGCA 
CATGGCCAAGGCCAGCTCCTGGAAGGGCAGGGCTCCGAGGCAGGAGGCAG 
GAGGCT G CCCGTGGCTACCCGTCCTCACACCCCTGCAGCTTGCTAGTCTG 
TCTGTGGGCTGGGTGTGAATCAAGGCAGTGGGATGGTGTGGGGACCTCCC 
TGGCCCCAGCAGCCAGTGAGGAGCCTGGTCAGTCAGCAGAGCATTCAGCA 

25 GTATCCAGTTCCATGGAGAGGCCCGTGTGAGGGGAGTCGGGGCTGGTCTT 
CAGTAAGGATGGGTGGCCAGGGCCCCTAGAAGTAGAAAAGGAGACTCCGG 
GTGCTGGAGACAGAAATCAAGGATGTGCCTCCATGTGGAGCCTCAGGAAT 

GGCTGACAGAGCACAGCCGGGCCAGGGACCAGCCTGCCCTGTGTTGCCTT 
30 GTC CC GAGGGCCACTGTCAGCAGGTCTCTGGCATGGGGGAGGCTTAGGGC 
CTGAGC C C AACA AGCAGCAGCGGAAGAGGAGAGGGAA ACT G T GGAC AGGC 
CTGGCATTCAGTGGCCAGGTGTTGCAGTGTCCCTGAGGAATAGCTTGGCT 

TCAC CATGGG GTGC ATCTTCC A GG3'C TTCGACT A C AT CTCTGAGTGCATC 
35 TCCGACTTCCTGGACAAGCATCAGATGAAACACAAGAAGCTGCCCCTGGG 
OTCACCTTCrrCCITOCTCTCAGGCACGAAGACATCGATAAGGTGGGCC 
GGGTGGAGGGGCAGAAGGCAGATGAGGGGAGGCACAGGCACCCCAGAGGA 
A CTCTGCCTTCAAATGTAGCCCCCATACCATGTGCTCAGAAGGGAGATCT 
GGATTCAAATTGTGGCCATGTCACCTGCCACCTCTAATGCTGTGGAAAAG 
AAGCATCACATTAGCTAATTCTGGCTGTGCGCCTTGTGAGGCAC C A G CTA 
5 TGATCACCCCACTCCAG TGG A AAG A GC A GCT GGC AGT A GG G TGG G G CTCA 
A ACTC AGG C AGCCGGGCTCTGGGTCACCTGCAGGCCACGGTCATGTCACA 
CrG CCTCTAGCT GAGTCAGAAATGTGAAGGAA CT GA GAITCTACCCrrCC 
TGCAAGCTAGCAAAGTGGCCTGCCAGTTACATCTGTGCATGCACACACAC 
ACACAGTIATATATGCACAGAGA^^^ 

10 GAAAGCCAGATCCTCACTCACGGCAGAAGCAGCAGCCAAAGCAACATCTC 
ATGTGGTTTTCCAAGCCCCAGTCCCTACAGAGACAGAGAGGGCCAGGTGG 
CACCTGTGCATGCAGCGGGGTACCTrGCAGGAGGGAAATCCTGATTITAC 
ACAAAGCTGCTCCCCCCACGCCCTGCCTTGACTCTGGGATGACGTCTCAG 
AGCTGTGCAGTACAACATTCTTAAATTGGCTGGGACTCAGCCCTGCAGAA 

15 ATATG ATATCTTC A AGG AG AATCGTTCCCA A AACCTCTC A A AGCTATGGG 
GCTGCTCTGAGCCTGTTTCCTCAGCTGTAAAGTAGGGTGCATACTTTTAT 
GGCCCTGTGCAGGAGGTAGTGACAGGCCCTAGCACCCTGCCTCCAGTATA 
TGTTAGCAGCCACGAGGCCTATCTCTCCCCACAGGGCATCCTTCTCAACT 
GGAGCAAGGGCT^XrAAGGCGIGAGGAGGAGAAGGG^ 

20 CTTCTGCGAGACGCTATCAAACGGAGAGGGGTGAGGGGGCACCTGTACCT 
GC CGGGGGGC iCTGCC C TGGG CCACCCACCCCAGCACTGCCTGCCTTTCTC 
CTTGGCTTCC AGC ACTGCAGGTTCTGTGCTTCTTGGC AGG ACTTTG AA AT 
GGATGTGGTGGCAATGGTGAATGACACGGTGGCCACGATGATCTCCTGCT 
ACTACGAAGACCATCAGTGCGAGGTCGGCATGATCGTGGGTAAGGGCTCC 

25 TTGCACCCCTGCCCCTTCCAGACTGCTGAGGCTCCCTGTGTACAACAGGC 
TTCAAGGGCCCTGTGGGGTGAGGACCAAACTACTTAACAACCGGTGATGT 
CAGAGCAGAGCCTGGTGCTACAGCCTGGGTGGTCTTGGGGTATCAAGATG 
GAy\GCACCGTGTACAGTAGGAAGCATTTCAACGCCATGATGCCACATTCC 
TGCATCAGATGGTATGCCAGCTGCATATCCACCTCACCCATCAGGATTAT 

30 A ATT A A A AC ACTTATCTGGTA A ATTG ACC A ACTGG AC AG ATTGGTCC AAG 




ACGGGCCTGCCCCTGCCCCTGCCCCC ACCC A A AGTG A AGGC AGGTACC AG 
GAAA G GGAGC A GCAgCGCCt C C C C TCCCAGCAGA G G G GTCTrCCACACCAA 
CT-C^ACC^KjT C AGA AGTTC C GGAGG TCATTA T - A AC C A G CCTTCACTG 
35 AGGAGCAATCCAATCAGATCAGTTATCTGCTGTGCGCACAGCCGTGTGGT 
TCTATA CTrC I^ CTr A C TTCCAl T'I TCAC CT T r CA G AAGGAACGlTGTCTT 
TAA^TCCAGCATCTAAACGTGAGCCCCAGCCATCCCTGGCTGTGATCCCC 
CCAGCCC1TTCCACCCTATCCTCTGGAACTGCCTGGGGCTCCCCAAGACA 
CTTCCACATGAA r rTCCCACCAAGCCAAGCTGCAGCTGCTGGGCCCAGGCA 
TAACCCCTCCTGGGGCAGAGGTGGCAAGGAGTGACCCACCACTCACATCT 
5 GCCCCACATCCACTCTTGACTCTGCTCAGTGTTTAAAAACATGTTTATAA 
CAATTACCAAGATCTGAAAATTAGGAGAATTCACATCAAAGTCTGGATTT 
CTGTTTGTTCATAAAAAACTAGAAGGCAGCCAGGCAAGGTGGCTCACGCC 
AGTAATCCCAACACTTTGGGAGGCTAAGGCAGGCGGGTCACTTGAGGTCA 
GGATTTGAAGACTAGCTGGCCAACAAGGTGTAACCTCGTCTCTACTAAAA 

10 ATACAAAAATTAGCTGGGTGTGATGGCGCATGCCTGTAATCCCAGGTACT 
CAGGAGACTGAGGCAGGAGAATTGCTTAAACCCTGGAGGCAGAGGTTGCA 
GTGAGCCAAGATCACGCCACTGCACTCCAGCCTGGGTGATGGAGTGAGTG 
AGA.CTCTGTCTCCAAATAAATAAATAAATAAATAAAAACTGGAAGTCTAA 
GCATCACTGAGCCCTGATTCCTATGTGGCAGCTCGACTGACCAGCATTTG 

15 AGTTGCTGTCCCTGACAGCTTTGGGGGTGTGCAGCCCACACAGTCATGCT 
AG CTTGAGGCTCTGCTGTCAGCAG T TTGAAACTCTTAATAACTTGTGAAC 
AAAAGACTCCATGTT G TCACTCTGCACAGGGGCCAGCAAATTACAAAATT 
CCATATCCGGAATrGTCTACAGGAGCCTCTGGGCTGCTCCCAAGGGCCCA 
CA C CAT GCC TTA CT CACnT G G GTTGCCATCCAAACATGTCTCATGACAA 

20 AGAAGCTCAAACATGTGCATGGACAGTGCCAGAAAACAAGGGTCGTACAT 
AG AC A A A ATA A A ATG ATAACGTCCC AC A ACC ATTTCTTTG AT AC AC ACTG 
TTTCTCTCAGTCCTCCCAACCACCTAGGTAACAGGCAGGGAAGGTGTTAC 
TGTTGCCTGTTAGG A A AG AGG AC AGCCCTGAA AGCTGTCCCTGGCC ACTG 
AAGCAACCCAGGTCTTCCAGCCCCAGGGAGAGCCGCCTriXXAlTGTTCC 

25 AGACAAAGCAGAGACAGGCATGGGGGAGCGGGAGAGGGACTCCTGTGGGC 
AGGAACCAGGCCCTACTCCGGGGCAGTGCAGCTCTCGCTGACAGTCCCCC 
CGACCTCCACCCCAGGCACGGGCTGCAATGCCTGCTACATGGAGGAGATG 
CAGAATGTG G AGCTGGTGGAGGGGGACGAGGGCCGCATGTGCGTCAATAC 
CGAGTGGGGCGCCTTCGGGGACTCCGGCGAGCTGGACGAGTTCCTGCTGG 

30 AGTATGACCGCCTGGTGGACGAGAGCTCTGCAAACCCCGGTCAGCAGCTG 

GCGGCAGATGGGAGCCGGGCCATTGCAGATAATGGGCTTGTTTTTAAACA 
ACTCTGGGGAAAAGCAAACTGACAATCCGTTCGTAAGCTCCATCCCTTCT 
GCTCAGTCATGACCTGCCCCTGTGAGAGATGAAGGGTTAGTCCCAGTTGT 
35 GATGTGATAAGCCCAGACCTCTTTCCTTCCGACAGGTGATCGTGCATGCA 
GAGGAGGCTCTGAGACIGCX'CCCAGCAAGGrrCCTGGGTrrAACCCAACAT 
TC . CCCAAAGTATGTATTTGGCCACATTCACAGAAAGAATATTAGTCTTTT 
GTGGAATGCTGCG GGITG A C A GT CACAGC TT G G AAACCAACCCACAG AG A 
GCTCATCATTAATCATGGCTATCACTTGTTTACCACCTACTGTGCCAGGC 
CTATGCTAATrACT T TATTAGCGTCCTCTCTGCCGCTCGCAGGCCTCTAT 
5 TATrATAGgKAGTAGTATTCGATTTATITAAATTAAATACG G AAGGTCA 
TAGATTAAGCAA GAAAGTG CCAGCAACA TG G TGC GTGCCTCTGACTGGGC 
ACTAACCCTCCAAGTC'ITAGllTTCCCAACCATAACTGGC C AATGAACAG 
CAGCTCTGGATGCAGCTAAAGGAAGACTGAAGCTGTAGGTCCCGTGCTCG 
G CGCAGGGCCCCCTGCAAGGA A GG TTTCG GAGGGACTGGATGGGGTCTTT 

10 GAACTATCTGTCTTTCCCTTTACTGCAGTGGGCCCA G GGGC A GGCCAAAG 
TrGCTCCCGTGATTGACTTGAACGTGCACGTTCCTAATCCCTGACATTTC 
X AAAGCTCTGGCTCArrAACGAGGGAAAGACGTGAACCAGCTGGGGGAGT 
GGGGATCGCAGTGCCCCACGTGGCCGCCTCGTGACCTCAGTGGGGAGCAG 
3 X?>GGGCCGGCTCCCGGCTTCCACCTGCATCtAGGGGCCCTCCCTCGTGCCT 

15 GCTGATGTAATGGACCTGCCCTATGTCCAGGTATGAGAAGCTCATAGGTG 
GCAAGTACATGGGCGAGCTGGTGCGGCTTGTGCTGCTCAGGCTCGTGGAC 
GAAAACCTGCTCTTCCACGGGGAGGCCTCCGAGCAGCTGCGCACACGGGG 
AGCCTTCGAGACGCGCTTCGTGTCGCAGGTGGAGAGGTGTGCGGAGGAGG 
AC^G-PGGGTGGAAAGGGGAGGGGCIGGG€*A^ 

20 GGTCTCAGGGCGACGCTGAGTCCCAGGCCCGGGGCGCAGGGATGGGAAAC 
TAGGGCCTGGGGCGGGATTCCGGGCGTGGGCGGGGCCCGGGGCGGGGCAC 
AG GGGGCG G GGGAGTGGGCGGGGCCCGAGGCCGGGCGCTGGAGGCGAGGG 
CGGGGCAGGGACGGGTCCAAGGGCAGGAGGCTGGGACAGGACGGGGATGC 
AAAGGGAGGGGCGGGGCCCGAGACGGGGAGGAGGGGGAGGGCCCAAGGGG 

25 AGGAGGCGGGGTCCGGACGGGGATGCCAAGAGCAGGGATGGGAGCGAGCC 
T-GCG- TCC G GG C AC T G G TCCCCATCCGTGAGTCCCCTCGGTGCTCCCTGCC 
CGCCGTGGCCATCCTCTCACATCACTCACAACCCCAAGGCGCGGCATGGT 
TGACACCCCCACGTTAGGACGGAGACCCTGGGCTTAGTTAGAGGGGGCAG 
TACrAACCAGTCCCTGGCGGAAACGCTTTGGCTGGGTGAGGTGAGCGGGA 

30 TCGCCCCCAT1TCTCCAGAGAGGGGTCCCGGCTCAGCGAGGGAAAGAGGC 
GCiCCGGTGGGGGGAGGGGTGGGC^GGGGGGGTCCX^TGGAGAACXJAGAGGG 
CGCCGCTGGAGGGGGATGGACTGTCGGAGCGACACTGAGCGACCGCCCTA 
G CTCCTCCC G CC C CG C A GC GA C ACGGGCGACCGCAAGCAGATC T ACAACA 
T C C TGAGCACGCTGGGGCTGC G AC C C TC GACCACCGACTGCGACATCGTG 

35 CGCC G CGCCT G CGAGA GCGTGTCTACGCGCGCT G C.GCACATGTGCTCGGC 
G GGGCTGG C GGGC GT C AT CA A C GGCATGCGCGAGAGCCGCAGCGAGG A CG 
TAATGCGCATCACTGTGGGCGTGGATGGCTCCGTGTACAAGCTGCACCCC 
A GGTGAGCCCGCCCCGCTCTCTCCCTGGTAAAGTGGGGCCCAAAAAGCGC 
GCGCTCCAAGGTrCCTTGCGGTrCCCAAGCTCCAAGATTTCGTAGTCCTC 
TTCTCGTCCCCCTTGGCCTAGATTTGGGGGAAGGGTCGACTGCGTGCAGG 

TGCTTCTCTTCTGCCCAGCTTCAAGGAGCGGTTCCATGCCAGCGTGCGCA 

GGCCGGGGCGCGGCCCTGGTCTCGGCGGTGGCCTGTAAGAAGGCCTGTAT 
GCTGGGCCAGTGAGAGCAGTGGCCGCAAGCGCAGGGAGGATGCCACAGCC 
10 CCACAGCACCCAGGCTCCATGGGGAAGTGCTCCCCACACGTGCTCGCAGC 
CTGGCGGGGCAGGAGGCCTGGCCTTGTCAGGACCCAGGCCGCCTGCCATA 
CCGCTGGGGAACAGAGCGGGCCTC'ra 

CCCAGGGCCCTAACGGGGGTGCGGCAGGAGCAGGAACAGAGACTCTGGAA 
GCCCCCCACCTTTCTCGCTGGA4TCAATTTCCCAGAAGGGAGTTGCTCAC 

15 TCAGGACTTTGATGCATTTCCACACTGTCAGAGCTGTTGGCCTCGCCTGG 
GCCCAGGCTCTGGGAAGGGGTGCCCTCTGGATCCTGCTGTGGCCTCACTT 
CCC TGGGAACT C ATCCTGTGTGGGGAGGCAGCTCCAACAGCTl^GACCAGA 
CCTAGACCTGGGCCAAAAGGGCAGCCAGGGGCTGCTCATCACCCAGTCCT 
G G CCATTTTCTTG CC T GAGGC TCAAGAGG C C CAGGGAGCAATGGGAGGGG 

20 GCTCCATGGAGGAGGTGTCCCAAGCTTTGAATACCCCCAGAGACCTTTTC 




GCAGGTGCAAGAGACAGAGCCCCCAAGCCTCTGCCCCAAGGGGCCCACAA 
AGGGGAGAAGGGCCAGCCCTACATCTTCAGCTCCCATAGCGCTGGCTCAG 
GAAGAAACCCCAAGCAGCATTCAGCACACCCCAAGGGACAACCCCATCAT 

25 ATGACATGCCACCCTCTCCATGCCCAACCTAAGATTGTGTGGGTTTTTTA 
ATT A A A A ATGTTA A A AGTTTT A A AC ATGGCCTGTCC ACTGTTCTTTG ACT 
TCTGTGCATTAGGACTGTGGGGACAATCTATAAAGAGTCTGCGTCACATG 
CATGAAGACACTTCAGTATCTCGGCAATGCCCTCCAGACAGCTCCTCCAG 
CCATCTGTGCCAAGGGGAGTGTGAGGAGTGACAGACCAGGCTGTAGGAAC 

30 AGGAATGGGGTGTCATGGGGGATGGCAGAGCAGTGGACAGTACACTGCCT 
GGCCCCKjGCC C CTGCTTGCCTGCCCATGGAATGTGTGCAGAG G GAG TG C C 
AGGCCAGGTGCTGCTCTGGAGAAGTGGGGGAATGAGGCTGGTCCTGCTGC 
AGGTGACTC^AGCACCGTC^ 

35 TCCTGTCAGTGGAACGCCTATTTAGAGTTAGCCAAGCGTAGGCATAATGC 
CATCriTCTGCAGCATAAAATACAGTGACATAGAAACATATTTGTGTGAT 
TTrCATGCATTCCTTTTTTGATGAGAGATATTACCCAGCTAATTAGGAAC 
A ^CTGTTTTGTTTCCTTCAGATC AT AAC CC A A AGTTGTG ATTTTGA A A AG 
TCATGTCCCCCTTCAGATTTCTTGTTTTCTGCTACTTCTCATGTGGAATT 
GCTgPGGCJCCTCTTAGTTCTCTTGAGTCTA 

ACAGAGTGGAACACATTTTAATATCAATTGCTTTCTATTTCTCCTTTATA 
TTACAGTTCAGGACATTGTATTAATTATTAAAATTCTATTCGTAGGTAGG 
TT AT ATG ACTG A ATTG A A AT AG ATA A A ATG A ATTTCTTTTCT AG AT A AC A 
AAGGAGGTGTCATAAAACACTTGTTATGGGCCAGTGTGATGGCTCATGCC 

10 TATAATCTCAGTGCTTTGAGAGGCTGAGGTGGAGGATTGCTTGAGGCCAG 
GAATTTGAGACCAGCCTGGGGCAACATAGCAAGACCCCATCTCTTAAAAA 
AAAAAGGGTGGGGCGGGGGGGCACTGCTGGGCGCGGTGGCTCATGCCTGT 
AATCCCAGCACTTTGGGAAGCCAAAGCAGGTGGATCAAAAGGTCAGGAGT 
TCGAGATCAGCCTGGCCAACATGGTGAAACCCCAACTCTACTAAAAATAC 

15 AAAAATTAGCX-GGGCATGATGGCGGGTGCTTATAATCCCAGCTACTCAGG 
AGG 47rGAGGCAGAAGAATTGCTTGAACCCAGGAGGCGGAG G TTGCAGTGA 
GCAGAGATTGCACCACTGCACTCCAGCCTGGGCAACAGAGCGAAACTCTG 
TCTCAAAAATGAATTAATTAATTAAAAAAAGAAAAAAAAAACACTGGGCA 
GGGTOGTCT-GCAGGT-GT-AGTC 

20 GAGCACTTGAGCCCAGGAGGTTGTCTGCAGTGAGCTCTACTCATGCCACT 
GC ACTCC AGCCTGGGTG AC AG AGCTC AGTGGCTT AC ACCTGT A ATCCT AG 
C ACTTTGGGAGGCTGAAGCAGGCAGATCACCTAAGATCAGGAGTTCGAGA 
CCGGCTGGCCAACATGATAAAACCCCGTCTTTACTAAAAATAAAATAAAA 
TAAAAAATATATATAAAAATTAGCTGGGTGTGGTGGCACATGCCTATAAT 

25 CCCAGCTGCTTGGGAGGCTGAGGAACAAGAATGGCTTGAACCCGGGAGGC 
AGAGGTGGCAGTGAGCTGAGATCGCGCCACTGCACTCCAGCCTGTGCGAG 
AGTGAGACTCTGTCTCAAAAAAAAAAAAGGGAATTTAAGAAATTTAAAAG 
AAAACT^TG TTATA T- AAAAAGGGTATTGGGTCTGACAGATAAGAGCTCC 
TOCACPeTACCAGCCAGCTACTGACAGACATAGGTCTGGCTCCAGTGGAG 

30 GGGCAGCAGCCAGTGAGCCCAGCCTGGGGTGGCCCACTCCTGCTGCCTCC 

GCTGGCGAGACTGCTTCTCTGGAACAGCATCACGCAGGCCTGCCCATCGG 
CCCACTGTGCACCAGGCCTTCTGGGGATACAGATGTCAACCAGGTGGGGT 
GCTCAGGAGGGGCACAGAAGCCAGGAATGACAAACACATCAGCCACCAGG 
35 CAAATGGGAAATGTGCCCCAGAAGCTCCCTGCTGAGGATGTTAGGGAGAG 
CATTCTGAAGTAGTGTGGlTGAGATGAGGCrrGAGGAAGGCAAGGCTCCA 
AACAGCAGGGCAGACTGGGAGCAAGGTAGACTGCATGGGAGGGCAGCTGA 
TOGAGCTCCTrAACCCTCTGGAATTGCCCCAAAGCCAAGCAAAGTGTTCT 
TCTTGGGGTCACAGCTAGCTCAGGGATGCCTTCTGCCCCTTGGTCAGAGG 
GGCAAAAGGTCAGAGCCTAGGGTCACCAAAACCTCTGGGAAGCCCCGGGG 
5 GTCTCAGGCCACAGACCATCCTCAGAACTACACACTGCCCTCCCATGCCT 
GGCGGGGGCCCTGGACTGGCCCTCACCAGCTGTCTTCTTGCACTGGCCAG 
GGTTCTGGCTGGACTGGCAAGGAGGGGTGGTCAGATACAGGAGTAACTGG 
ATCCCTTC ATC AGG ACCTAG GGTG GTG AG AGCTTTG AG CCTGCTCTGCTC 
CAGGCAGACATTGTGTCTGGCCCTGCCAGGATGGATAGACAGCAGGATGT 

10 TACACGTTGAGGACATGAAGGTCATCAGGAATGTGGCTGGAATCTGTTAG 
GC'CTCCCCCAGCCCAGGCGGGGGCTGCCAAGTTTGGGCCTATCCTCTGTT 
C C TCTCCrr MlTGGACClTCAGGTGATAAGGCTGAGACATAAAGGAGGC 
TGGGCCCTGCCACCACGACAGCAGCCACACCTCTGCAGAGAGAATGGTGA 
GTGCCTGCTGGGGAAGAAAGGCTAGCGGTCTCCCAGGTGCTGGCCTTTGG 

15 GCTGGGGGAGCAGAGTTTTCTGTGCTTGTGTTGGGTTGAGGGTGGTCCCC 
AGGG AGAGGAAGAGGATCCTGGCCCTGGCTCTCCTGGGAATGCTCTGGGA 
CTGTGCATGATGGGTGGGGTGGGGAGACTCTGAGGAGTTGGGGAGAGGAC 
CCCTCCCTACTCACAGTGTTGCAGGCCAGCAGGAAGGCGGGGACCCGGGG 
CAAGGTGGCAGC,GAeCAAGGAGG€CXAA«^ 

20 CCATGTTTGAACAAGCCCAGATACAGGAGTTCAAAGAAGTGAGTGCCCAC 
TCCCAGTAGCCTCAGATCCCATCCTGGCCCCCCCACCCCACCCCACATAC 
AT-ACG CCCCTTCTACCCTGACCTTGCCTCTCACACCACCCAGGTCTCTCC 
CCCACCTCCCACCTTCCCTAGAGCTGGGGGCTGCTCCCACCTGAAGGCCC 
CCATCCCACAGGCCTrCAGCTGTATCGACCAGAATCGTGATGGCATCATC 

25 TGCAAGGCAGACCTGAGGGAGACCTACTCCCAGCTGGGTGCGTGCACCCA 
CCTCCCACCCTGCGCACTGGGGTCCCTACTCTGAGCTGCTGGGCGGGTGG 
GAGTGGCTGGGGGGACAGGACTCTGCTCCCCTGCTTCCCCTCCTCCCCGT 
CTCCTCACACirTCCCTTCCCCCCTTGTCACGCCTTGCTTCCACTTCACCT 
TCCCGACCCACAGCTGCCTCTGCCCCTCCAGCCCCTGTGGCCAGGATGGA 

30 GGGAGGGCGGCCTGGGCCTTCTGGGGGACACCCAGGGTCCCTGTGTGCAC 
CIX^-KjCC^GAGC^GGAGGAGGGAAGGTGACT 

gacgccatgctgcaagagggcaagggccccatcaacttcaccgtcttcct 
g acgctctttggggagaagctcaatggtgagcctgggacagagctgggca 
gggtt ggc c aggcagggagcctgcaccctgcctgaaccccacctgaacc g 
35 tgcctgaaccccacctgaaccttacatgaaccccacctgaaccctaactg 
aaccccaccl xjgacccacctggactcti'cxtrggccatgacccaltccaag 
CACATCCTCTGCCCCAGAATCCCATGTGCACTGGTCACCCCAGTGCTGAC 
T T GG AGCCAGGAAATGTGCCTTCAGCCCCCACCCCCAAATTCCAGTCTCC 
CAGCCAAGCTGCCCGCCTCAGGAGGATGACCATTCCCAGCCCCACTGATC 
CC CGAGAAACATT T TATG T TAGGGAATACCCCCACCTC T TCTGGGATGTG 

GAGACATGGATG CTC ACCTGGCTGCCTCGGC CTTCCAGGGACAGACCCC G 

GTGGTGAACAAGGATGAGTAAGTATGGGCCCAGCCAGATGAGGAGCACCG 

10 GCTTCGGGG C C T TAC AC TGCTCTTTGGGGTGCAGCCAACCCTTCCCTGCG 
CCATGGGAGCCTCCGTACCCACCTTCCCTGTGCAGTCACTCCCCCGCAGT 
rTCrT GC TC A G A CCCT C CT CACCCCCCAGGTrCAAGCAGClTCTCCTGAC 
CCAGGCAGACAAGTTCTCTCCAGCTGAGGTGAGGCTGCCCAGCCCCTTCA 
AT AC T CA TCCCC AGCACCTTCTCTGGGCCTTCACCCATGACCCAGAGCCC 

15 AGTACCAGTGAGGCAGTTGCTGGAAGGGTGAGCCGAGGGCCCTTCTGGAG 
&AGG3^3CCA3€ICTOTTO AGA C CTAGAG G GTAA A GATG T GGAGTCAGAAA 
AG AG GGCAGGGT GCGCCAGGCAGGGAGACTGTGCACAGACCTGGGGGGAA 
GTGGATAGGGAGAGGTTTCGTACACTCGGGGTGGGCCTGTGCCTGTGGCT 
GGAGGGGCGTCCTT T GC CT C TT GGC C CACATTrGCACTGACTC C TCACTC 

20 TGCCCAGAGTCAGCCAAGAGAAAAACATTAACCCAGAGTCTGGGGTCTAG 
GGTT G AAAAG C TAAGGCAAAAAGCACAGATGCAGGGGGCAGACAGAA A GG 
CjCACAGG ACT CAGGTGAGGTCTCTGCCGGGC T GGG CCAG GAG C G AGGGGA 
CTGCCACTCACCAGTGTCCCCTGCAGGTGGAGCAGATGTTCGCCCTGACA 

25 CACCCATGGAGACGAGAAAGAGGAATGAGGGGCAGGGCCAGGCCCAGGGG 
GG G GCAC CTC AATAAACTCTGTTGCAAAATTGGAATTGCTGTGGTGTCTT 
GTCTGTGACAGATGGGTTGGGGACCAGCCAAGGGGGATCCCAGGGTCTCA 
GTGCGCACATCACCATGATCATGGCCACCATCTACCTCCTGGGAGCTGGC 
CCCTCGCG AGCTCACC TrGATTCAC T CCC AT GATGCCAAGTGAAGTGTGA 

30 ACTATGATCATGCCTAGTTTA CA GATGAGGACACTGAGGCCCAGAAAGTG 
TGAG C ATC r TACCAAGGC C AGCCCTCTAGAA G AGGAGATGGTGGGATTTA 
CACCACCTCCACCAAGCCCAGGAATGAGCCACAAAGTGGGCACTGCCCAG 
CT A C TTGGG G CTGTGCAGAGAAGAGGCTGCTTG CrGGG CA CTCA GC A AA C 
IG T GCCC AACAGCC CAG C G GGTGGGCAGCAGCC C TGGGACCCCCACAC CC 

35 AACCACACAGCCTCCCCTGGCCCACTGCTCGCACCCCATCTCAATACACT 
GGCTTGGGTGCCrCCCTGCATGGGCCCTTTGTGAAAGGCAGAGAGGTACC 
CATTTGAAACACAACCAGCTTCTCATTGCAAATACAGGCAAGGCACTAAG 
ACATGAGGAACATGGACACC A AAGCAGGGGC C AGGTAACATGCAAATTTC 
TAGAGGAAATGCCCAGAACCTGGCATCATGCCTCCTGAGCCCCTCATGCG 
CC GTGAGGGGTAAGAGGGTCAGACAGCT GG AG TGT AGGGAGACGACTTCT 
5 CAGGAGAGAAT A GTr AGTGCTCCCGTCACCCTrCATCTGAGAACCCAAGA 
GCTAGAGGAGAAAGTGATCCTCATGAGTACCAGAGGAGCAGCAGGGGACA 
TCCAAAGCACCAGAGAGAGAAACAGAGACAGAGAGACAGGCAGTGACAGC 
TCA AA C CTC AG C CAGATCCAGAGCATACAAAGTCTCCTGCCTACAGGACA 
GCCCAG T AAGAGC T CTCAGC TTGCCTCCTTCCCTCC CCACAAGCCCTGCT 
10 G CAATCCCT GTACCTGGGGGTCAGTGGGAAGGAGGTGAGCGAGAAAGGAG 
GGGCACCCCTTCCTGAAGGCCCCAAGAGGAAAGGCGTTTTCACCCAGACA 
CGTGTTCAGTnTTGATTTT-AJCTOGCGCCrGCK^^ 

GAAACTTGA GA C TTTCTGGAAT T ATGGC AT TTTCTGTTGCTTAGAGAGAT 
TAC AA AA GTCAC GAAC T GCCTGAGTTTCCA TCCTGA A A GCAGG C CACCAG 
15 CCCACTCCACTGACCATGCTGGAACAGTGGATGAACAAAATCAAGTACCA 
TT AGGATTCTAC C A CATGAGTCTG CTTG T^ 

G TAAGGG ATCAT GTTATAATCCAAGCTCTACAGG G GTAAA1TGTGAAAGA 
CTAAAATGAACCAAAAAGATCATAGGTGTCCAGTTATCTGATTTGATGGG 
GTOTCTGAACCT^ 

20 TTATTATTA TTTTTGA GACAGAGTCTCTCTCTGTCACC C AGGCTGGAGTG 
CAGTGGCATGAT CTCAGCTCACTGC/VACCTCCACCTCCCAGGTTCAAGTG 
ATTCTCATGCCTCACCCTCCCAAGTAGCTAGTATTACAGATGGGCACACC 
TTGCCTGGCTAATTTTTGTATTTTTAATAGAGACGTGGTTTCACCATGTT 
A GCCAGGCTG G TC TC GAAC TC C TG AC CT CC GTTG ATCCACCTGCCTCTGC 

25 CTCCCAAAGTGCTGGGATTACAGGGGTGAGCCACCGTGCCCTGCCACAAC 
TCTAAATTAJAACTAATAGCAAGGCAATGGTTCTTCTCTATTAACGTGCA 
A A TAAATGTTGTC CAGTGGAAGCACAACTGATTTTTCCCTTCTCTGTGGA 
ACiAAG€CAAJ^^TGCATCTA TTAAG < ^A ATTC AT CT GGGCATTCCTAA (^ 
GTCTACACATGCACCGGCrCTTTGAATTCTTCTCTGAACCAGGCCCAGGA 

30 ATAAGCCACAAGATGAGCACTGCCCAGCTCCTTGGGCTGTCACATCTTAT 

TTAGTATGCAAGTCAATATTTTGCTTTAAAAATATTATCCTTTCACACT C 
CTGATATAGTTGTCTGATAAGGTTAGTCCTTCCCACACCAAAACTGCCTG 
TATT AGT G TT G TT TGGAATAAACTGAGGGTAGAA T GTATATGGTGTGTGT 
35 ATGTGGTGTGTGTGTTTGTGTGTGTGTGTGTGTGAGAGAGAGAGAGAGAC 
AA AAGAG AG AGAC AG A AGGATAGAGAGAA AC AGATGGGC AC AGACCCAGG 
ACATGAGTTCAGCCTACACTGACCAATATGACAGCCACTGGCCACTTGAA

TGAAGATTTCATACAAAAAAGAATGCAAA C ATC TC ATTAATAACTTTTAT 
ATA GA T CACATG TTG A AAT GATAATGT TTT GGATATTAGATTAT T AC T AA 
5 AA1TAA TT T C A CC TA T 1TCTTTTCACTTTTTAAATGTGGCTACTAGAATA 
TTTAG AAT TCC ATAAGTGGC TTGCATTTCTGGCTTTCACTCCTGTTGGAA 
AGCACTGAGTTAGACTGTGTAGTACGTCTATTTAAGACTGCAGTTTCCAG 
GCCGAACACCGTGGCTCACGCCTATAATCCCAGCACTTTGGGAGGCCGAG 
GCGGGCAGATCACCTGAGGTCAGGAGTTTGAGATAAGCCTGGCTAACGTG 
10 GTG AAACCCTGTCTCTACTA A A Ax^TAC AGA A ATTAGCC AGGTGTGGT AGT 
GCATGCCTGTAGTCCCAGCTACTAGGGAGGCTGAGGCAGGAGAATCTCTT 
GAACCCAGAAGGG G AGGTT GC AG rG A GC CAAGATCAAGCCACTGCACTCC 

AAAA TA A ATA AATAAATAAAGACTGCA GTTT CTGGGAGACTCTGAGGCAG 

15 GCATTAGCCTTCTCTGCAGAGAGTACTTGCAGCAGGGAGCAGCAGTTTTG 
A TGTCCTCAAAAGGAGCCAATTTCATTTGGG TAGGGT T GCCT CT G AGTAT 
T CTAGCAGTACAGACAGAAAGGAGAGAAGG CT GTTTCCAGAAAGCAGAGA 
TCATACGAATTACTTGTGAGACCAAACTTGTTCCTCAGGTGAAGCTCAGG 
C ATC CCTT ATG T GGAGTG T CTAACAGTCTACACCTGAGGATGTTGGACAT 

20 AAGGGGGT G TG AGGT GGGCATGGC TGGGGAG AGCTCTGGGAGGGGGAAAA 
CCAGCTCCATGTTGTCCACCCACTGAAAGGAAAGCTCCCTCTGGGGGAGG 
^GATGCCCCCTGGCCAGGCCTOCAGGGCCCIGCTCACTGTGAGCCCTGT 
GTGGTCCTGGCCTG GGT C CC ACCAGCCAT TGCCAGGC AACAGCTCCCAGT 
TGGAAAACAGA GCAAG GC TC CCTC TTAGAAAAAAAAAAAAGAAAGAAAGA 

25 A A AG A A A AGA A AT AC A AC A GGT A AC T AA GC A T G ACGGCTC ACGCCTG AA A 
T CCCAGCTACTTGGGAGGCCAAGGCAGAGGATTGCTTGAGACTGGGAGGT 
TGAGGCAG C AGTGAGCCAGGATTCTGCAATTGCACTCCAGCCTGGGTGAC 
AAAGTGAGACCC TAGT AA A AAAAAAAAAAATAGAGACAGAGAAAGAAAGA 
CAl'GCAACAGGGCCAGGCGCAGTGACTCATACCTGTGATCCCAACACTrT 

30 GGGAGGCAGAGAAGGGAGGATTGCTTAAGACCAGGAGTGCAAGACCAACC 
TGGGCAACATGGCAAAAACCCATCTCnTCAA A AAATAAAAAAATTAGCCT 
G T TGTGGTGGTGCGCACCTATAGTCCCAGATATTCAGGGAGCTTGAACCA 




GA€AGAG€GAGACCT TGTG AG AA AGA AAAG AA AGA AG GG AA GGA AGG A AG 
35 GAGGGAAGGAGGGAAGGAGGGAGGAAGGGAGGAAGGAAGAATATAGGACC 
CAAAGGCCTAAATGCCCCTACTGTGCCCCAGTTCTGCGTGACTCAGGACC 
AGCCTCCTCCACACTCCCACCACCACAACCCTGCACCCTACTTGTTCCTG

CTTACCTTTCCTGTCT7TCTAG A AGCCTC AG A AGCCTTGCC ACTCTA AGG 
ACACCTCCATCTGAGCCAAGGCGCTCGCTCCAGATGTCCCAG.AGCTCCTG 
5 G TCCTGGGTGTCCCTGCCACACAACCCCCCA'IXjGAGCCCTGCTCTGGCIC 
AAGCCCCCTGACTGTGCATGAGCAGGCCTGTTGCCCTCACTGGGACTGTC 
CAGAGCCTI'CCCATCTCTCTGGAGGGACTTCCATCAGTTTCTGCCCCTTC 
TCCTCTGCCAAGAACTCACGTrCAGTCTGATAGCAGAAGAATCATCTGGC 
ACCCTCCTGAATGGAACCCAGAGTACCTCCTTTGTGGACCGGTCTCTGGA 

10 TTTTCCCCACTCTCTCCCTTCAGCCATGCTGATGGCAGAGAAGGTAAGAA 
CTTCCAGCCCACTTCTCTGGCGAGGGGAACTTGTCATCTGGGTCTGCAGA 
GAAGGTTCCACCTTATGCTCATAGTACATTATCTTTACTATGTACTAGGA 
TATCACATTTAAAAGGACAAAAAAGGCCAGGCAGTGGCTCATGCTTGTAA 
TCCTAGCACTTTGGGAGGCTGAGGCAGGTGGATTACCTGAGGCCAGGAGT 

15 TCAAGACCAGCCTGACCAACATGGCGAAACCCCATCTCTATTAAAAATAC 
AAAAArTAGCTCKdGTGTCG^iGCA-FGJXjCCTACAAJCCCAACTACTrGGG 
AGGCTGAAGCAAG A GAATCACTTGAACCCAGGAGGCAGAGGAT GCAGTGA 
GCTGAGATCGTGCCACTGCACACCAGCCTGGGCGACAAACCGAGACTCCA 
TCT CA AA A AATAA^AA^AATAAAAT A C AAC AAA A TA AAAGAAC AAAAAAA 

20 AAGAAATGTAAAATACTTGAAGGGGCTTGTATAA C A TT AATAGGATTGAC 
AGTATCTGCTTTCCAGGCTGAAGTGATTCATTCATTA TT CTAGA C G T CTT 
TAGTCCTTTGCAATTTGTGGTAATTAGGCTTTTCTTTTTAACATTAAAAA 
TATACAAAAATAAAAGGCAAAAAAAGCATCATCCCATTAGTCTGACCTTC 
CCCTCCTCCATCCCTGCCCCAACACCCTGAA G AC CCTG GATGCAAACAAA 

25 GGCCCG AGGG AGCCTCTTCCCTCGC AGTGC AGGCCTC ACCTGGGGCTC AG 
AGTCAGAATCTGCATTTTATTCCCTAGGACAACCTCTAGTCAGGGCAGAG 
GCCGGCTGTGCTGCCCAAGTGCCCTAACCCTAGCTTTGAGGCACCAGAAG 
GGCAAAT GCAAA TTAAAAATGAGAATAAGTTTATTCTCCTTGGTGAAAAA 
AAAAAAAAAAGACnTOCCC3aCC3mTC 

30 CAAGTTCCTTCCTGGACTTTTTTTATGTAGATCTGTTCAAAAGCTAAATA 

GCC ACC ATTTG AAGTGTA ATC ACC AAGGGAG ATAC ATCCTTATCTCCC AG 
TTTCCGTGGGCAAAGGGGAGCCTAACTTTAGCCCGGTGCCTAGCTCAAGT 
TCC A A AC AC ACTTCC A GTCTTA A AGG A ATG A ATTT ATTTTTTTTCCTTT A 
35 GGCAAACCCAGGTAGCCACCACAGTTACCTGGGGATTCACAGAGAACTGT 
GTGTGACCACTGGTGCTGTCAAGTCCTCTTACC T G A G CA CCTGTGACGTT 
TCCCTTGAGAACGTGTACGGGATGGGTTGCACCTGGTTATATACAAGCGT
(L\GAC'ITCriTTC'TGCX. TITCiTAATTTATrAGCAGATTATCTGTGATGAGC 
ATCGCAATCTGTTTAATGCCTATTCAATAATTAAATTTTTCTTTCTCTTC 
TTTTGTGGAAAGGTTTTCTGCATTGGCAGGAGATTTTTGTTTTCGATTAT 
5 GTCCCCAACATGCCTGATGrrCCACCCCTCAAGAGCCTCAGCClTGCCC A 
GGGAGGGCATGGGGGTGAGTGGCCTCTCCCACAGAGAGTGCTGGCCAAGT 
TGGCCCAGGTGCGCAGCAAGGGCTGCTGCCCAAAGGCTCCCTCCTGGTTG 



The human liver glucokinase genomic DNA is 46,000 base pairs in length and contains ten exons (see
Table 32 below for location of exons).

The human adipocyte enhancer binding protein 1 has the amino acid sequence depicted in
SEQ ID NO:3

MAA.VRGAPLLSC.LLALLALCPGGRPQTVL TDDE.1EEFL EGF.LSELEPE PREDDVE A PPPPEPTP R VR 
KAQAGGKPGKRPGTAAEVPPEKTKDKGKKGKKDKGPKVPKESLEGSPRPPKKGKEKPPKATKKP 
KEKPPKATKKPKEEPPKATKKPKEKPPKATKKPPSGKRPPILAPSETLEWPLPPPPSPGPEELPQEGG 

AP-LSN^-WQNP^^ 

20 ERVWPEPPEEKAPAPAPEERIEPPVKPLLPPLPPDYGDGYVIPNYDDMDYYFGPPPPQKPDAERQTD 
EEKEELKKPKB1EDSSPKEETDKWAVEKG.KDHKEPRKGEELEEEWTPTEKVKCPPIGMESHRIEDN 
QrRASSMLRHGLGAQRGRLNMQTGATEDDYYPGAWCAEDDARTQWIEVDTRRTTRFTGVITQGR 
DSSIHDDFVTTFFVGFSNDSQTm'MY 

GS LCMRLEVLGCSVAPVYSYYAQNEWATDDLDFRHHS YF ajmQ^lKVVNEECPTITRTyS L G K 
25 SSRGLK1 Y AMEIS.DNPGEHEL GE PE F RYTAGIHGNEV.LGRE.L.LLL.LMQYLCREYRDGNPRVRSLVQ 
DTRI H LV PS LNPDGY E VA AQ MGSEFGNWALGLWTEEGFDIFEDFPDLNSVLWGABERKWVPYRVP 
NNNLPIPERYLSPDATVSTEVRAIIAWMEKNPFVLGANLNGGERLVSYPYDMARTPTQEQLLAAA 
MAAARGEDEDEVSEAQETPDHAIFRWLAISFASAHLTLTEPYRGGCQAQDYTGGMGIVNGAKWN 
P R TGTINT^FSYLHTNC LELSFYL GCDKFPHESEL P RE W ENNKEALLTFMEQVHRGIKGVVTDEQGI 
30 PIANATISVSGINHGVKTASGGDYWRILNPGEYRVTAHAEGYTPSAKTCNVDYDIGATQCNFILAR 
SN W KRIRE 1 MAM N GNRPI PH TOPSRP M TPQQR RLQ 

PPTLPPAPATTLSTTIEPWGLIPPTTAGWEESETETYTEVVTEFGTEVEPEFGTK\rEPEFETQLEPEF 

ETQLEPEFEEEEEBEKEEEIATGQAFPFTTVETYTVNFGDF 

and is encoded by the genomic DNA sequence shown in SEQ ID NO:28.;- 

CAGCAGGGCCAAGGTCTTGTGACAATGTCTGGAGGTGCCCCTATTGTCACACTGGGGGTCTCC
X ACTGGCCTGCAATGGGAGGAGGGGCTGCAGCCCCACATCCTGTGCAGAGTGCTAGTGCTGA 
GGCGGAACCCTCCTCAGAGCTGCCCCTTCTCCTCCAGGTTGTTACCCCTTCTACAAAACTGACC 
CGTTCATCTTCCCAGAGTGCCCGCATGTCTACTTTTGTGGCAACACCCCCAGCTTTGGCTC 
5 C AAAATCATCCGAGGTAATmTGTCrrCTGGGGGCCCAGGCI'GAlTTGCTGAnTGCTCTCAC 
CTGGGGACAAGGTTCACAGAGAAGAAAACCTGCATTGTGGAGTCCCCCTGGCCCTTGTGGGA 

TGGGCAGGGACTGGAACGGCTCCCAGCCTGTGTGCCTCTCAAGGCTAATCTCTGGTCTCCT 
A-CTG-T C ACTGC ^ CCC ACTGTGTGCC A ATGGGG ACTCCTGTTT ATTTCTGGC A GCTTCTCTTTG AG 

10 GCAGGACTTACTTGGAACCTACAGTGGGTCCTATGTGACTTCTTTGCAGGTCCTGAGGACCAG 
ACAGTGCTGTrGGTGACTGTCCCTGACTTCAGTGCCACGCAGACCGCCTGCCTTGTGAACCTGC 
GCAGCCTGGCCTGCCAGCCCATCAGCrrCTCGGGC'ITCGGGGCAGAGGACGATGACCTG 
GGAGGCCTGGGGCTGGGCCCCTGACTCAAAAAAGTGGTTTTGACCAGAGAGGCCCAGATGGA 
GGCTGTTCATTCCCTGCAGTGTCGGCATTGTAAATAAAGCCTGAGCACTTGCTGATGCGAGCC 

15 TTGAGCCCTGGGCACTCTGGCTATGGGACTCCTGCAGGGGTGCCCACAGTGACCATAGCCCAT 
GCACCCACCA GC CGGTCT CC CTCCTC C C CATCC CTG A C ACCTCAGAATGTGA GC A GT C CG TGCC 
ATGAGCnTGTTTTATTGGAGTGACCTTGGCTCCCT 

ACCCCATCTCTTACGAGACTGGCAGGTGGAGCAGGAGCCTCTACACAGCCTCTGGCTCTTAGG 

IXLXXAGTCATG^ 

20 GGGGCCTTCCATGTGCTGATGGGTGATGTGACTGTGGTCAGCAGGCTTGGGAAGTGC 

'1GCTGCTGTAGCTTGAGTTGGGCTGGGGTCTTGGTAGGACGCTGATCTCAGAAGTCCCCAAAG 
XK^CTG-T GTAGGTCTCTACTGTTGTGAAGGGGAATGCCTGGCCAGTGGCTATCTCCTCCTCTT 
TCTCCTCCTCCTCCTCTTCCTCAAACTCGGGTTCCAGCTGGGTCTCGAACTCAGGCTCCAACTG 
G G TCTC AAAC 'r CGGGCTCCACCrrGGTCCCAAACTCGGGCTCCACCTCGGTCCCAAACT 

25 CTGTCACCACCTCTGTGTAGGTCTCAGTCTCCGACTCCTCCCAGCCAGCG GTGGTTG GCGGTAT 
GAGGCCCCAGGGCTCTATGGTAGTGCTCAGGGTGGTGGCAGGGGCAGGGGGCAGCGTGGGAG 
GCACAGTGTGGGGGCCTAGGGTGGTGGTGGCGTTGAGGCGCCGCAGCCGCATCTGTGCCCGA 
AGCCGCAGGCGGTGTTGTAGGCGTCGCTGCTGCAGGCGTCGCTGTTGGGGGGTCATAGGGCG 
CGATGGGTCTATGTGTGGGATAGGCCGG r rTCCCG'ITCATGGCCATGATCTCCCGGATGCGCrr 

30 CCAGTTGGAGCGAGCCAGGATGAAGTTGCACTGAGTGGCCCCGATGTCATAGTCAACATTGC 
AGGTC TT G GC C iC TCGGGGTG TA GCCCTCCGCGTGGGCTGTCACGCGGTACTCAC C CGGGTTC 
AGATTCGCCAGTAATCACCACCACTGGCTGCGGAGGGAGAACGATCCGGCTGCCCCAGAGCGC 
€(Xa£CeAGG€C^eGA€GG^^ 

CGCCC CC AGCACCGCCCTCCCTCTCTGAATTrCGCCCCCAGGCTCCCCAGACTCTACCTGCT -CGC 
35 TGAGTTCCTCAAGCCCCCACCCTCTCTGGCGGGTCCTCCCTCAGAAAGATGGGGTAAAGGTGT 
GCACACTAG G^ TACCTGTCTreACGCCGTGATI\A t Al'GCCACTCACAGAGATGGTC G C G TrG GGA 
ATGGGGATGCCTTGCTCGTCCGTCACCACCCCCTTAATGCCGCGGTGCACCTAGGGAAGCAGG
TCA GGGCTGCTGGTCCTCAGGAAGGTCCAATGTGGTCCGCTGCTCCCTCCCGCCCATCCAGGA 
GCCTGTGCAGCCTCCTCTCCCCAGGCATTGCCCTAGCCACCCCACCTGCTCCATGAAGGTGAGC 
A ^XjCCTCCTTGTTGTTCTCCCACTCGCGGGGCAGCTCACT CT CATGAGGGAACTTGTCACAGC 
5 CCAGGTAGAAGGAGAGCTCCAGGCAGTTGGTATGCAGGTAACTGAAGTCATTGATAGCTGGC 
CGGGGACAG AT ACAGACCCAAAGTCAGCCCC TCTCC GGACCAGGCCCCGCCCACAGCCCCTCC 
CAG GCTG A CTCACT C C CG GT C C GG ( X ? GT TC C A C TT G GCCC CG TT GACGATGCC C A TG C C G CC G G 
TGTAGTCCTGGGCTTGGCAGCCTCCGCGGTAGGGCTCGGTCAAGGTGAGGTGTGCGGAGGCG 
AAGGAGATGGCAAGCCACCGGAAGATGGCGTGGTCTGGAGTCTCCTGGGCCTCGGAGACCTC 

10 GTCCTCATCCTCCCCCCGGGCTGCTGCCATGGCTGCGGCCAGCAGCTGCTCCTGGGTAGGCGTG 
CGGGCCATATCGTAGGGGTAGGATACTAGCCGCTCGCCGCCGTTCAGATTTGCTCCCAGCACG 
AAGGGGTrCTTCTCCA'rCCAGGCAATGATGGCCCGGACCTCCGTGGATACCTGGAGTGGCCAG 
CACGTGTGAGGCCAGGGCTGCAGCTCCGGCCACTATCCCCAACCTAGC C CGATCACCCTCCATG 
AAGCTTCACACCAG TACT CGCACGATCCCCTGTCCCC C AACCCCCAG A GC CT CAGCGTCTGGAG 

15 TTCAGGCACCGTCAGCCCCACCCCCAAGCCCAGAACACCAGGACCCCAGGGTCCAGCTGCTCCC 
TCCTGC^GTTTC AGC C A GGC TGTAG CCTCA CCGTGGCATCTGGCGAAAGGTAGCGTTCAGGGA 
TGGGCAAGTTATTGTTGGGGACCCGGTAGGGGACCCATTTC 

C AG AGTTG A G ATCCGGG A A ATCTTCAA AGATGTC A A AGCCCTCCTC AGTCC AC AGTCCC AGCG 
CC^AGTreCCAAAO^^ 

20 CCACAACCCCCAGCTGCCTGGACCCTGGCCAGCCTCACCCTTCAACCCACCATCTGCGCTGCCA 
CCTCGTAGCCATCAGGGTTCAGTGAGGGCACCAGGTGGATGCGTGTGTCCTGCACCAGGCTGC 
GCACACGTGGGTTCCCATCGCGGTACTCTCGGCACAGGTACTGCATGAGCAGCAGCAACAGCT 
CTCGGCCCAGCACCTCGTTGCCATGGATCCCAGCAGTGTAGCGGAACTCGGGCTCCCCTGCAA 
GGGCGGGAGCCTCAGTGAGCACTCAGTCTCCCGAGGCCCAGGGCAGCTGAGGAAGGACCCAG 

25 ACCCACCTCATACCCGAGGGTCTGGGGGACAGCTGGGGCTCCTAGGGCCCTGTAAGACAAGCC 
AGAATCCCCAGAGAGG C TCCGGAACAGGCGGGAGGCAGTGAGCTCTGCACATCAGCAGCAGA 
GGCCAGCTGCTGGCCCCCACAGACCCTCCCCCAGTTCATGCTCCCCAGGGTTGTCTGAGATCTC 
GA TGGCATAGATCTTGAGGCCTCGTGAGCTCTTGCCCAGGCTGTAAGTGCGGGTGATGGTGGG 
GCACIgCJgGrrCACCACCTTCATGAGCTGGCGCAGAGGGGGAGGACGTGGAATCAATCATGC 

30 AATCCGTCCCCCGCTGACCATGCCCCTTCCACTTCCAGGGCCTGCTCTATGGCGAGGGACGGGC 

CCTGGAACCAGTAGGGACTGGGCCCAGTGACAGAAGCACCAGGCACACACTCCCGTCAGCCAC 
AGACAG G TCCC A C CCCCAGCCC CAGG A TATAT GGTCCCAACCTGGCGCATGTCCTTGTAGCTCT 
GGTGCCGGAAA TC CAGGTCATCGGTGGCCAC C AC CTCA TTCTGTGCGTAGTAGCTGTAGACAG 
35 CTGCAAGGGAGGCGGGGTTGTCTTTAGCTGGGTGCCGGCTGGCCC 






SUBSTITUTE SPECIFICATION-MARKED UP VERSION 



ACCCTAGCACCCCACCTCCACTCAGAGCCCCTGCCAGCCCTCCACACTCACGGGCCACAGAGCA 
CC C CAGCACCT CCAGG C GC ATGCACAGGCTGCCAT TC CAGGTGAGTGGGTAGATGCGGATGAA 
ACGAG C CACCACCGGCTCTGGGAGCTCACTCAGCACGGGTGTGTCCTTGTCCACGTTCCCATG 
AA AG GTCTGG GGAGAGGCAGGCCTCAGAGCAGTACTGCCAGCCCC TCT GAGAGCCCACCC 
5 CTCGCCCAGACAATGGGAGCAGACrCCAAGAGCCTGGGCATGGTGCCCACCATrrCCTCATAGC 
CGTTGGTGTACATCACCCATGTCTGGCTGTCATTGCTGAAGCCCACGAAGAAGGTGGTCACAA 
AATCGTCACTGTGGAGTGGACAGTGGTCAGAGCAAGGGTC'rrCCCCCTCCCAGGCCCTCAGGr 
GGCCTGAGCCTCCCTCTTCCGAGCCCCAAGAATTTAAGAGCTAGCAGGGTGGTGCTGCACG 
GCCCAGGTGTTGAGCCTGGGTCCTATGCCCGTCACATAGCCATGGGCAGGTGATCTGTCCCTA 

10 AACTCATGTGCTATCAGGACACAGGGGCTGACTGACCAGGCTGAGGAGTGGGGATGGGCAGG 
GTGAGTCCCTCACTGATCTTTTTGGCCTTCTTTGGCTGGGCCAAAGAAGGGCCCACTGGAATC 
TCCCT AATGGGACACAGAGCCATGCCTATGTAGCCACTCCCCTCTGCCAACTATCCATGA GC 
CTGGCCACGCACTGGATGCTGGAGTCTCTGCCCTGGGTGATGACGCCTGTGAACCGGGTAGTC 
CTCCTGGTGTCCACCTCTATCCACTGGGTCCTGGCATCGTCCTCGGCACACCACGCACCATCAT 

15 AGTAGTCGTCCTCAGTGGCACCGGTCTGTCCAGGGGGCAGGGGAGGCTGAGCATGGGCGGAG 
GAGTCCCTTATCCCAGTTGGGAGATGGGCCCATCCCAATGCCCACCTGCATGTTGAGCCGG 
C CGCGCTGTGCC CCCAGGCCGTGGCGCAGCATGGAGGAGGCTCGGATCTGGrrGT CCTCAATA 
CGGTGTGACTCCATCCCAATGGGGGGACACTCTGAGGACGCGTACCCCAGAATGGTGGCTCAC 
TACKrrCCATCC TO CCTCC ACC AA ACCCAGAACCAAGGAGCCCAGAGCCCACTCCCGGCACAT C 

20 GGGGGCACAGTCAGAGGGCAGCTCTGGTCAGCTGGTGGCTCCCTGGTGCCCTGCACCAGC 

CCACCTGGAATCGACTCAAAGCCAGGCCAGGAGCTGTTTCCAATCCCAGCCTGTGCTTCCCCTC 
CCTGGGCCTC AGCTGCCC C ATC TGG AG AACGG GC TG A CC ATGCCC A GCTCTC AGGGG AC AC AC 
GT G AAATCACAGGTAGAGCTCCCCCAGGGCGCAGCCACAGATGTCATCCAGATGGGGACCGT 
C TGCA CAA TGGCCCTGCAGGGATACCTGT GA A G G T A CCT GAG GT CCTCACTCCCCACCAAGGC 

25 CCCAGGTCCTCCCCCTACCACGCCCAGCCACTAGGGGCCCTGGGGAGCTGCCACCCTCCTGAAG 
CAGGCCAGCCTGGGGTCCAGGGCT GG GGCAGCCAAGCGAGGCTATCCTGGGCTCCCGGGGCCC 
CTCCCTTCTGGGTCCCAAGAATCTGAGTAGGAAAGGGTTCCGGGGACCTGGGTCCTGTTTGTG 
ACATTGGGCCAGTCACTTGTCCCAGCACCCCCATCCTGTGGCCCCCACCCTCACCCCCTTGTGCC 
C CCCACTrACTGACTTTCTCCGTAGGCGTCCACTCCTCCIXXAACTC^ 

30 TAGGGACAATGAAGGGAGGACATGGCACCAAGGGCCCGGGAGGCAATCAGGAGTCCAGATG 
CTGCCCCACACKjGA CCC AG G CCCCAAG^ 

ACTGCCCACTTGTCGGTCTCCTCCnTGGGGCTGCTGTCCTCCTTTTTGGGTTTCTCTGGAAGGT 
GCA AGGTAGGAGGGGCCAGTCAGCCTGGCTCTGGGCTTTGAGGACCATGTGGGGTGGATCAG 
GCAGGCCCCAGGTGGCCTTCAGGGCAGGCCTGGTGTGGGAAGTCCTTGGTCCCACTCACTCAG 
35 CTCCTCCTTCTCTTCGTCCGTCTGGCGCTCAGCATCGGGCTTCTGGGGCGGAGGAGGCCCAAAG 

T-AA^'AG^-CXACTA.^^^ 
GCTCCTGGGTCCACAGGAGCTCAGCAGGAC A GG AC CGCGCCAGAGGGGAGGAGGACGGGAG
A TG G GGG A C ArjCTGAGTTCj G Ci AG A GG G TCTTGCAGGAGTCAG GAGCA GCCCGAGCTCAGGG 
G C AGC T GA G CAAGACCCTGCTGAAGTCACCAGCCCGGCCTTCCAGGAGCATCTGGCCTGGGGA 

5 .TTGCGGGCCGGGGGGTC A GA TC 

GAAATGGCGATGCACGGGCTGCCACCCAGGAGGAAAGGGAACCTGAGGGCTCCAGGGACGCA 
GGGGCATGAGCAACAGGGAGGCAAAAGCCCTCGGGCTCCCTGAAGAGAGTGGGGCAGTGGCC 
ACGAGCCAGCGGGAAGCCAGTTAGAGCACAGGACTGGGAGGGCTGGAACCCACATGGGTGAC 
AGGGCAGAGTGTGTGCCTAGGGACACCCCTGTGGGGGTCACAGCCAAGCAGGAACCAGG GAA 

10 GCGGCCAAGGAAAGACCAGCCTGAGGGCAGAGGAGACAGGGCAGTGGCTGGGGTGGGCACG 
CAGGGACAGCAGGGACAGCGAGGTAACCACGGGCACAGGTGGGGTTGCAAGGTGGGTGAGT 
K iCCCCAGCTGGCTCCTGACCACACCCCAGCCCCGACCCCCACCTGCCTATGTCCCTCAGACTCT 
GGGGTGCTGGGTACTCACTGTCATCGTAGTTGGGGATCACGTAACCATCACCATAGTCAGGGG 
GCAGC GGGGGCAGCAGAGGCTTCACAGGAGGCTCTGGGGAGGCGGGGAGGTTAGGAGGGGG 

15 CCAGAGCGCCGTGGCCATGGCACCTCCTCTCCTGCCCCCCATCCTACCAATCCTCTCCTCCGGG 

TTGGGGGTGGCCTGGGTTGCTTCTGGCGCCGAATGTACTCAACTGAGGGGGAGGCTGGCTCA 
GAGTGGGGCCCAAGGCTGGGATGGGCCCATTGGCACATCCCCCAGGCCAGGGGTCCGACCCA 
GGTGGGGC T GGC A GGACCCrACTCAAAGTCCTC A TAG T CC T CCC r CTCGATCTGGTCATTGTA 

20 GTCCAGTGTGGGTTGCTCGGTCTCCTCCTCCGGCTCTGAGGGGAAAGCGCTGGTAGCTGCCTG 
ACAACCCCACCCAGGCCTACTCTGGGGAAGCCCTCAGTCCAACCAGCCAGGGCAGCTGGCCCC 
A AGGCC AGGCGG ATGACGGCC ACTC ACC AGGCTGGTGCTCCTGTGCCTCC AC ATGGGTCTCCT 
CTCCTGGATTCTGCCAGTTATTTGAGAGGGGCGCCCCTGCAACACAGGAGTTCCAGAAGCAGG 
TGGGCGGGAGGCCTGCTCTGACCACCTTGGGAGCCTCAGGCCACCAGCCACCCA T AGAGC CC A 

25 CACAGAGCCTGTGGACACCCTCCTGAGGCCGAGCTCACTCCAAGGAGGCCTGAGCTCCTCTGG 
CCTTCAGCA T CCTGCTGGCATCTCATGGGGCCAGAGAGCTGGGCCCACCTTCTGGGGAACCTA 
CTGTGCTGCTGGAGGCCCTACCACAAAGCTGTCCCCAGCGGGAGAAGGCAGGAGGGAACTCC 
ATGGGCTCAGAGCCCAGGGACATCTGGGCAGGGGCCTGAGGGACAGAGGTCCCACCCAAAAG 
GCTGCCAAGCCCTCTCCCTACCCAAAAGAGGCTACAGCACTGAGGGAGCCCACCAATCAAATT 

30 GTGAAATTTATAGCAAAAGTGAGGTTCCCATCCAGTGGGGAGCTGAAGGTCTATAGGAAGCA 
GGGCCCCAGAAAC CT GC C TCC C ACTCCCTG CCTC CA C CCGAGCAGGC A GTCAGA G CCCCAT C A C 
CCCAGAGGAGCCCGGCACAAACCTCCCTCCTGGGGTAGCTCCTCGGGGCCAGGGCTGGGGGGT 



GGGCTTCTTGGTGGCCTTGGGTGGCTTCTCTTTGGGCTTCTTGGTGGCCTTGGGTGGCTCCTCC 
35 TTGGGCTTCTTGGTGGCCTTAGGTGGCTTCTCCTTGGGCTTCTTGGTGGCCTTGGGTGGCTTCT 
CCITCCCCTTC'ITGGGCGGCCTGGGGGACCCCTCCAAGGACTCCTTGGGCACCITGGGGCCTIT 
GTCTTTCTTGCCTTTCTTCCCTTTGTCTTTGGTCTTTTCCGGAGGCACTGTCCAAGATGCAGACT
CGTCTC A AA TG A AC AG AGCC AGCTCTGTGCCCCC ATG AGGCCCCTCTCT AG ATGCCC AG A ACC 
TGGGCACAGGGACTCTTGTCAGTTCCCAGTGCGGATCAGCAAACTGAGAGGTTAAGTCATTTG 
CCCAAGTGGCAAACTGGGATCCGGACCCAGATTTTCTGTCTGCAAGTCTGGGGCTGTGACCAC 
5 CAATCTCAACCTCTCTAAAGACTGAGCGTAGGG1TCCCAGTTCCCAGGGGGAGGCCCTCATCC 
CCCCACCTGCCAAAACCTCAATAGGGGTTCCTTACTATCCACTCCTCCACTATTCTGTTCTGGG 
CACAGAAGGGGCAGAGAGGTGAC r rGAGCCATCCAGGCCTGGAGGAGCATCTGGTCATCCCTG 
CCAACTGCCATACAAAGGAAGGGACATGGGCCCAAGACCTTCCCCTGGTCTCCTACGGGGCAA 
G A AA AGCTTC A A AG AAA AGGGACACTTGGTTG AGTATTG AAGCCCAAAGAAGAG GAAGTGG 

10 TCTCCTTTCGAGAAGTAAGGGGTTTGGAATTGATTGGAAGGATAGGGAGTCCTGGGGGGTTC 
AGGGATCACACAGAGGACAGAAAAGACAGGTAGGGAGCTTGTGGCTGCACACTCATTTCAGA 
G TCTGGGAGAGGGAGCAGGGACTGGTrGTGAGGArrCCCCATGGGAATCCTCCCAGGACCCT 
AAGCAGGAGCTGCAAGTGCTGTTGAGAACCTGATGAGAGGTGGGGAGCATGAGGGAAGTTT 
GGCAGAAACACAGGAAAGCTACCAAATGCAGACAGCCAGGGGACGCAGGGCT GCTA GAGCG 

15 GTGCCCCAGAGCCAGGAGAGCAAGCCTGGAAGGAGAGCCAGAGGCAGGAGGGGCACAGGCA 
GCCGA GGGTGTGGGAAGCAGCCAGGAAAGA TCTAG AGCTGGGGTGGCAGGGGAGGGGCTGC 
TGACATCAGGAATGlTGGATGGTGCCTrGGAATCTCCTGGGAGACAGGGATCACAAGACCCT 
CTGCCACCTTCCAGAGGGCCACGATGAAAACAGCTAAGATTTACTGACAACTGATTATGCAAG 
AGGCC G TGGGTTAAATGCITCAGTGA T GCA T CACCTCATCTAATTTCCTGTACT 

20 CCACCCATTGCTCACCACCACCTGAAGCCCTGTGCTCACCACCACCTGAAACTCTCTCACCTAC 
G TGAGACCTCC T GGAGTAGGAGGGCAAAGGCAGGAGGGAGGGACGACGTGAAGCTGTGCCA 
C C AAC A GG G AGAGTGGTCCCATTAGTATGGCAGGGGG T GACA C AG CA C AGTC C CCTG TGG CT 
CAAGCCTAGTACCTGTCGCGTACTGGAGGAATGGGGATAAGCGACCCGTACAACCACAGCAC 
CAACCCTAGAGCCACCGGCCCCCAAAAGCGGCCCTGCCGCCCGGGTGCTGGATGTGCCTCCAC 

25 GCCAGCGCTGACCTCGGCCTAGCACAGGGTCCCTCCAGGCATCTGGGCTCGCGTGCGCATTAG 
T-A AGCCAGCCATTCCTCCCCTAGCAGACTGGGGAGTGGCCAGACCCTACCGAATCCCCCTGTTC 
CCACCTGAGATGCCAGCCCCCCACACCCCCGCCCTGCCCTGGGCTCTTACCTTCTGCGGCCGTCC 
CTGCK^CGCTTCCCTGGCTrGCCCCCCGCCTGGGCTlTTCG G A CCC G C GGGGTGGGCTCGGGAG G 
C G GCGGGCiCCTC CACGTCGTCCTCCCGGGGCTCAGGlTCTAGCTCTGACAGGAAGCC 

30 GAACTCCTCGATCTCGTCGTCGGTCAGCACCGTCTGCGGGCGCCCTCCAGGGCACAGGGCCAG 

CGGGGGGCTCCGGGGAGGGCGCGGGGGGTCAGGGGCTCTGGGTCTCTGGGAAAGGGCGGAG 
AGGGGATCGAGACGGGTGAGGGAATCCAGGAAGGGGCGGGAGAGAGGATGGGGTGAGCG A- 
GGGAATCCGGGAAAGGGAGGGAGAGTGGATTAGGGTGGGCGAGGGGACCCGGGAAGGGGT 
35 GCTGGGGGGCTCCGAAGCCAGAGGGGCTCAGGGGTGGTCGGGGCGCTCCGAGGTCTGGCGGC 
TAATAGGCGCTCCGGCCCCGCGTGGCGCACTCCCGCGCGGATAGCCGTCTCCAAAGCGCTGGC 
GGGGCCCGGGGCGGGGGCGCCGGGGCTTCCGGAGCCGGCTCCCCACCCCCGGGGAGGAGGAG
GAGGAAGAGA.AGGAGGAGCCGAGAGTGGACGGAGGGGCTGCGGGGGGGCGGGGGGCGGGG 
GGCGGGGGGCTAGGGGCGGGGCAGGCGGGCGGGCGCTGGCGGCGAGCGTCCCAAGCCCGGA 

GACTTGCGC^T^QGAC^ 

TCTGCCCTAGGCACCCCCCTTAAGGCGGGACCC C GAGTCCACCGGGGCTCTGAGC C CT C CGC G G 
GTGACCAGGAACCCTGGACGGAAAGCCGTGGTGTCAGGCCTCTGAGACCTCTCTCA AT rCGGA 
GGGCCACAGAAAGGCCACCCCATCCTTCCCAGGCTCTGGAGCCTCTGCCCATGGGCCCTGCTGC 
AT CCC A GCGT GAATTCATTCAGTCATCCTACCAACCTCTTCAGGTCGGTGTGGGGCCGGGCCCC 

10 GTGCTGGGCCCCAGGGAGGGACAGCACAGTGGGAACTCACTTTCCAGCCAGGAGGCAGGTGC 
AAAACTGCCCTCAGAGTGGCCAGCTGCCCCGCTGGGGGTAGGAGTCCCATGTAAGGGCATGCC 
ATCCCTCCCCTCCGGGTCCCAACGTGGACAAATAGCCATLTATCACCTTCTTCTTACCAGAACT 
CATTTTTTAAAAAGTGT C TACCATACCTCCAGCTGCCACATGGACCCAGAGGGCCCAGAGGAC 
CCAGAAGGCAGGTCKjATTGAGTGTCAACTGATCCCAGGA TCC A TC AGGGATGTCCACCTTGG 

15 TGCCTGGTGTTTGCCATAAGGCTTCTCCAGGGCAAATGTTGGCTGCCCTACAACGGCCATCAA 
Cw-VCiGeAGAG TCGTCCCATTA CTATOG^^ 

CTAGTCCCTGTCTCATACTGGAGGAATGGGGAGCTAAGGACAGAGCTCCGAGGACATTCCCCC 

TTAAAGGAATGAGGACACAAGAGAAAGCTCACAGGTAGTCCATGGGCCAAGTGCAGAGGCA 

GACAGCCCT A AGCCACGATTGTCT GCGGGGTTT G GCCCCAGTGAAG TA G TC AGG T A GGGAAG 

20 CCTAGGAGCC C CTG G GATGATTGACAGGGCAGAGTTTGGACCTGGGGTCAAAAGGAAAGAGG 
AAAAGTGGGICAGGAAGCACCTGGGT C C CCAGAGC A GCCCC GAGTGAGTTGGAGCAGGCAGC 
AGCCGGGGAGGCCACAGTGGAGGCTGCTGGGCCTGGGATACATGCCACCCCCTGGGAGCAGG 
ACCACAAGGAGGCCTTGCCTCCTCTCACACCTGGTCCTGCCAAGACCCTGCCTTTGCTTTCTG A 
CTGCATCTCCTTGAAAAAGCAGTGGGACTGTGTCAGGTTCTGGCTCTACCTCCCAGGCACCAC 

25 ATCTCGGCAGGTAGCCTCAGTGCCGTCCACCTGTGTC 

CCTGTTCTC C TTGTCGTTC ATAC AGG ATC ATGC ATGTGC TGTGCCTA GC AC AC ATTCTTGGC AC 
TCACACTGCTGCC TTTTAG C T CTCATCATTTGCCCTCAGAGATCAACC TG A GCTG TGCCCACTG 
GGGCGCTCAGACtCAGACCCTGAGCCCCAACACCCAGGCTCCCTGTGCACCTGAGCCTGCCTCT 
GCCTGCCACGTGCCCCCAGGCCAGTCCTGGTGGCAGCAAGGATCCGCAAGCTCTCCCCriTCCT 

30 CATCCTCTGCAAAGCTCTGAATCATCTTTCTCAAAACTTGTTCTGGGAATTTGCTCCGTTGCCC 

CTGGGAACAACTCACACATGGATTGGATTTGGGTCCAACATCCTCTGCCAGGGAAAATAGAA 
C<X: A T AAGAAAA CAA A A AA G G A A CAG A A GGAGGCTTTTCTTCAGTC,ACAGCGAGTCACC A AC 
AAA AAC A TGTGC AAAAGCT C TC A TGGAGAGCTGGGCCACAAGGAGGG CC AT GATGTT GGG GG 
35 CCCTCTGACACCAAGGGTGTGGGCAGGTGGATGGGAGGCAGCTGCCCTCCATGCCAGGCTGAT 
GTGCCTCCC r rTTGGGTGGTGGGGCTGGGACTCCCACTCCACTTGAAGACCTGCACCAAAAAGT 
CCTTTAGCCCTGTGCCCAGGCTCTGCCACGGGGCCGGTGAGGGGACTTCTCCCCTCTGCTGCCA

C CCTTAC C CC T GCCAC r rTAGCAGGGTCTGCACCTGCATCCAAGTGTTCTCCTGGGCTACAGTGG 
GGGGCTGGTAGACACTCTGGTGATCCACTTTCAGCTTCCCACATGGATGTGGCAGGGACTGCT 
5 T T GG C A TI TCCCTACCC CA A GGGA CAGCCACTGCGGCAGGACTGGGCTGGGGAGGGTGGGGC 
CTGC G CTGGGGAG GGTG C CC CCTGTCCCTTGCTGCTGCTGGAATGGGAAGGAGAGTTGTTGAG 
AG AG CCAG AA C TG T CCAAGGG TG GAAG CTGGC GA A A CT GA CCT GCAGGGAACAGGGAGACA 
GGGAGCATGGCCCAGTGAGTAGGTCCTATGTAGCTCTGAGGCCATCAACCCTGCCA T GAGGGC 
T GA GAC CC CAAGAGAGAAGTTGAGGTTG GGTCAGG GGCC TGTTA GT GCC AGCTGAGGAGGGG 

10 GACAGGCC AGCCTCCTCCCACTGGGACCCAAGCTATAGCTCCTGAGCCTCCAGAGCTGCCTGG 
TGCCTC AACCTGGTC AG AGGTGG A A ACTC ACCTGCC AGC AGGCCC AGTGTGCCTG AGTTCTG A 
CTGTGGGGATCTGCAGGGCACAGAAGGATAAGAGGTCATCAGGGCCTGGGGACAG G CAGGA 
GTGGCAGGGTCTGGGAGGCTGGGAGCAGACCCTCCCAACCTGCCCCATGGCCTCCGTGGCCCC 
CAGGACCCCCATGGCAGCAGCTCAGACACGGGTTGTGCCTCAGAAGGAAGTGAAGCTGTGT -G 

15 TACCGAGATGGCCCAGCAAACCCTTTGTATGTAAACTTCCGCCACAGCCCAGCTGTCCAGCAC 
GAGCATC^ JATCTGGGGGAGGGGGATAAAT A GAAGGTCTGGGAGGCCT GGGA^ 
CAGGCTACTGGGATCACAGATGCCAGCCCCTCCATATCTCCGCTTGAGTCCTGGATCTGCCTCC 
TGGGACCAAAGGGGAAAGGACCAGGCTAGGCTCCTTCCTTTTTGTTCTTCCCTCTTGGGGGAG 
GCTGCFAGAAACrCCX^CITC 

20 TTTGGGCCCACAAGATGGGATGGCTCCCAGAGCCATGGGACCTGAGGTCTCCCAGACAGTGTC 
TAGCCACCCTCACAACTGGCAGAACAATTTCCTTGGTTTTCAACAACTTGAAAAACATATGTG 
ATTTTCCACAGTCCGGTGCTTCTCAGGCCTGGCTGCTGAGTGAGCAGAGTTCATGCTGAATTC 
CTTCCACTCACCACAGGGCAGACAGCAAGCCCAGCTGTGGGGACTCGGTTGGGGTGGGGGTC 
ACCACAGCAAGGCGCGGGGAGTGGGGAGGGGGGCAGGCTrCCAGCACTGATGAGTAA'rrCTG 

25 CTGCCCGAAGATCTGGGAAGAGGGCATGTGACAACTTAGTGCAACAATCTGCCCAGTGTTAG 
G TCAGAAGGAAGGAGAGGTCGTTCAAAATGGAGTCTGGTGGAAAAAATAATGTTTGGCCCCA 
CCTCATACCTCCCTCAAAATTAACTCCAGATTAATGAGGTAGATGTTAGAAGAGGAACCAGG 
GAAGGACTACAAGAAAATATGGAGTCTTTATTTACATTGTGAGGTTTTCTTTAGGTTTTGTrT 
GT1TTTGTTITTGATATGGAGTCTCACTCTGTCACCCAGGCTGGAGTGCAGTGGTGCGATCCC 

30 GGCTAACTGCAACCTCCGCCTCCCAGGTTCAAGAGATTCTCCTGCCTCAGCCTCCCAAGTATCT 

GT AG AGATGGGGTTTC ACC ATGTTG ACC AGGC AGATCTC AA ACTCCTGACCTC A AGTG ATCC A 
CCCGCCTCAGCCTCCCAAAGTGCTGGGCGCCCGGCATGTGTGCCCAGCCTATATTGACATTCTT 
GATGGAGAAGTCTCTTAAGGAAGGACAGAGAAGTTTGGTTGCATAAAAGTTTTTACCTTC T G 
35 TAG ATC AAA ATATACTG A A A ATG AAA AT AA AG AGC AAAC A A AAT ACTG AG AAAG AATGC AG 
TGCTTAGAGAGCGAACATTCCTGGCCTCCTGTAGTTTTAGGAAGCAGCTGTGGCCTCAGAC 
CCATCTGCTGTGAACCTCTACTCCATATTTATTGCACTTTCTGTCTGTGAGCGTCGGTTTCTCTC
G-TCT A T AACAATA G GATAA T AATGACACTACCATGCCTTGCAAAAATGCTACAAGGGTTCACT 
GAGATAAATCTGGAGAGTCATGCCTGAAAAATAGTAAGTCGTTGATAAAGGGAAGCTGCTAT 
1 AATAAA rAAAGCTIlTl'CTr'ITTTlTTITTTTGAGATGGAA rCTCACTCTGGCGCCTAGGCTG 
5 GAGTGCAGTGATGCAATCTTGGCTCACTGCAACCTCCGCCTCCTGTGTTCAAGCAATCCTCCTA 
CTTCAGCATCCTCAGTAGCTGGGACTACAGG T GCGCACCACCATGCCCGGCTAGTTTTTTACAT 

CTGAGGCAGGTGGATC 

The adipocyte enhancer binding protein 1 gene is 16,000 base pairs in length and contains 21 exons (see
Table 23 below for location of exons). As will be discussed in further detail below, the human AEBP1
gene is situated in genomic clone AC006454 at nucleotides 137,041-end.
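
A coordinate range such as "nucleotides 137,041-end" is conventionally read as 1-based and inclusive. The following is a minimal sketch of how such a range could be cut out of a clone sequence held as a string. The function name `extract_region` and the toy sequence are hypothetical and serve only as an illustration; they are not part of the disclosed sequences or methods.

```python
from typing import Optional


def extract_region(clone_seq: str, start: int, end: Optional[int] = None) -> str:
    """Return the 1-based, inclusive region start..end of clone_seq.

    If end is None, the region runs to the end of the clone,
    which is how a range written as "137,041-end" would be read.
    """
    if start < 1:
        raise ValueError("start must be >= 1 (coordinates are 1-based)")
    stop = len(clone_seq) if end is None else end
    if stop < start or stop > len(clone_seq):
        raise ValueError("region lies outside the clone sequence")
    # Convert the 1-based inclusive start to Python's 0-based slice index.
    return clone_seq[start - 1:stop]


# Toy illustration only; a real clone sequence would be loaded from a database record.
toy_clone = "ACGTACGTAC"
print(extract_region(toy_clone, 7))     # nucleotides 7-end of the toy sequence
print(extract_region(toy_clone, 2, 4))  # nucleotides 2-4 of the toy sequence
```

For the clone named above, the corresponding call would take the form `extract_region(clone_seq, 137041)`, assuming `clone_seq` holds the full sequence of the clone.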

POLD2 has an amino acid sequence depicted in SEQ ID NO:4

15 mfseqaaqrahtllsppsannatfarwvatytnssqpfrlgersfsrqya.hiyatrliqmrpflen 
raqqhwgsgvgvkklcelqpeekccvvgtlfkariplqpsilrevseehnllpqpprs.kyihpd.de 
lvledelqriklkgtidvsklvtgt\ala\^fgsvrddgkflvedycfadlapqkpappldtdrfvl 
lvsglglgggggesllgtqllvdvvtgqlgdegeqcsaahvsrvilagnllshstqsrdsinka 
k¥-ltkkt4jaa^ 

20 LVTffl ? YQATlDGVRFLGTSGQ]WSDlF'RYSSMEDHLEILEWTLRVRHlSPrAPDTLGCYPFYKTDPF. 




& 

and a genomic DNA sequence depicted in SEQ ID NO:&r7. 

CCCTCCTCCA'rCCCTGCCCCAACACCCTGAAGACCCTGGATGCAAACAAAGGCCCGAGGGAGC 
25 CTCTTCCCTCGCAGTGCAGGCCTCACCTGGGGCTCAGAGTCAGAATCTGCATTTTATTCCCTAG 
GACAACCTCTAGTCAGGGCAGAGGCCGGCTGTGCTGCCCAAGTGCCCTAACCCTAGCTTTGAG 
GCACCAGAAGGGCAAATGCAAATrAAAAATGAGAATAAGTTTATTCTCCTTGGTGAAAAAAA 
A A A AAA A AG ACTTTCCCCTCTCCTTTTTCTTTAG A A A ATCTATC ATTGC A AGTTCCTTCCTGG A 
CTniTTrATGTAGATCTGlTCAAAAGCTAAATAAGCCTCTTTCAAG'nTCACATCCCAGGAAT 
30 GTCTCCTTAAGGACCTAGGAGCCACCATTTGAAGTGTAATCACCAAGGGAGATACATCCTTAT 

TGCAAACACACTTCCAGTCTTAAAGGAATGAATTTATTTTTTTTCCTTTAGGCAAACCCAGGT 
AGGC^CCAGAC^TT^CCTCtGGGATTCAC A GAGA A CT G 

35 TATACAAGC GTGAGAC TTCTTTCTGCCTTTGTAATTTATTAGCAGATTATCTGTGATGAGC 
ATCGCAATCTGTTTAATGCCTATTCAATAATTAAATTTTTCTTTCTCTTCTTTTGTGGAAAGGT
T~ TTCTGCATTGGCAGGAGATTTTTGTTTT^ 

TCAAGAGCCTCAGCCTTGCCCAGGGAGGGCATGGGGGTGAGTGGCCTCTCCCACAGAGAGTG 
GT GGCCAAGTrGGCCCAGGTGCGCAGCAA GGGCT GCTGCCCAAAGGCTCCCTCCTGGTTG 
5 GCAIGGGTCGGGACCCTGTTGTGTTGTGTTTTCGCTCTTTTTC 

GCTATGTTGT C C AGACTGGTC TTG A ACT GACCTCAAGGGATCCTCTCGTCTCAGCCTCCCAAAG 
3^GCTGGGAT3^CTGTGCCCAGC33TGTGTTGTA333TCTGA 

CCCCAACCTGGGCCCCAGTTCCTGCTGTGCCCCAGCCTGCCAGCCCTCTCTCTCTGCAT 
ATrCTTTCTTT AGCTGAG TT A A CAC CACTGATA AGG TT A A AGACAGGCTCTTAAATTTCTGCC 

10 CTGGCATGAGAAATATGTGACCCACATGCTTCTCCAGCTTAGCTGTCCAGTGTAACTGTCAGG 
GACTGATGGGCGCGTGCTGGCCCACAGCCCACCTCAGTCCTGACCCTCCCTGACAGGCTGAGA 
GAGG CC CCA G CCTGAACCTGG ACTCCCCCAT GTTCTGAT ATTCCTGCACAAGAGTGCAGAG 
GCCTGGTTAAGCTGGAGAAACATAAGGAATAGGTAGGTCTGCACACACTCACCTCTTCCTTTG 
C AGTGAACCTTCTAGAATCTTCTAGATGGAAAAGC T GGGGGTGTGGAGGTGTAGGGATAGGA 

15 CAGCTGGGG GAGGCCTT GGCCAAGGTCAAGGAGTAGGCCCAGTCTCCCTCTCTGTGTGCCTGT 
CTGGGACTCGG T TTCCTGTCTGTGAAGCAGG GCTGGACG GGATATTGACAGCACCTGATGGTC 
A IT GA GCTCCTCT GC C CCAGGCACTCAGCTGCTGGGCACAGTGCACA CGTGGCA GTCCGGTGC 
CCTCTCACGCTCCGTGATGACTGAGTCTGTAGTTACACCCCTGGCCTCAGAATAAAGACTACA 
Ca^CTGCCroCTC A CTGGC A GGTAT G ACTAGGTG TG GTGGC A GTTri^CTCCTTAAG A GACA 

20 GA T GT TTG TGCCTCCCTCCAACCCGCTGGCTAACACCTAGCTGGCACACAGCCTCCTGGGGCTA 
T- G A AGA T GA G G G CC A CAGCCACAGGGTGGGGGAGC CG TGAGCTGGGTCTG G CT G CG T CTCTG 
A C ATATGGG G G C ATCACACATCACCTCTACCTCCCATC GAATGCT ACACGAAGAGAACAAAC T 
CCACCTGATGGAAGCTGCTGTTGTTTGAAGTCTTTCATGCTCACAACAGAACCTAACCCCAAC 
CAATACAGTATGAGTA r ITGGCCCCACGTGGTTA J AGCAAGCTGTCCAAGGTTACACACAGCTGG 

25 GAGGTGGTGGAGCTGGGTTTGAGCCTGTTATTGACCTTTGTGCAGACAGACCTCAGAGCAGA 
GCACAAGGCACCAAGGCTGTGGGTCTGGGGCTCCCTCTCCAGGAGAATCAACTGGCTGCACAC 
AGCCTGGAGAGCCCATGGGCAACCTGAGTCCTTGCACCTGGAAGTTTCTGTGTCCCACACATA 
TCCAGGAGCTrAAAATGAAGATGTCTGAATTACCCAACCTCT T GATAGCACCAACCCAACCTT 
CCC AGCCTCCTCITCTGAGG T C AG CCCAGAG C AA GCCCC' n^GCAAAGCTGA'nTAACTCAGAA 

30 CC AC TGGGCATACCCACAGGGCAGTGACCCTGCAGCCCTCGATCAAATGTGCAGATGGACTTG 
G GGGTGGGCTG GTACCCCAG A TGG C C T CATTCT C CCAGGGTTGCAGAGCCCCTGAAAGCCACA 
GCCCTGTGTGCACACCACTGGGGAGTCATCACAGGATACTTCAAGAATTCAGTGCCAGGCAAG 
GT G GC TCATG GCTGTAATCC CA GCACTTCGGGAGGCTGAAGCGGGCAG ATC A C C TGAG GTCA 

35 ATCTGGGCGTGGTGGCGGGTGCCTGTAATCCCAGCTACTCAGGAGGCTGAGACCGGAAAATC 
G CrrGAGCCTGGGAGGCAGAGGITGCAGTGAGCTGAGATTGCACTGCTGCACTCCAGCTTGG 
GGGACAGAGTAAGACTCCATCTCAGAAAAAAGAGTTCTGTGTATCATTTAATGTGGAGATCC
TCCCATCACGAGGATGAGGCTGTTTCTCTACTCCCCAGATCTGGGCTGGCCTGTGGTTTGTTGA 
CCTCAGCCTTGTAGTTCTCACTTTCCTGGAACCTGAATGCCACCACGCGACATCCATAAGACAA 
AGCCCAGGATAAAAGATCACTTGGAGAGACAGGCCTGGCCTGGCACCACCCCGGCTGAGGCT 
5 GGACCCCTGGGAAGGAGACTCTGATGGACCTCCAGACCCAGT 

CAAATGACCACTTCCAAGGTCAGGCAAGAAGGGACAAAGAGCCACTGGCTCAGCCCACAGCA 
TCTGAGAAATAAGAAACCGCTGCATTTTTTGAGCCAGTAAGATTTGACAGGTTTGTm 
CAATAGATGAGTGGTACCTCATCTTAGCCCATGTTCTGATGAAGACAAACAGTAGCATTGACA 
AA G'ITTTAAGAAAAGTTAACCAAAAACTGGGATTCCTTTCTTCATTTTGACCCTTTGTTACAA 

10 GAAACAGAGGCCCACCCCACCAGACTCACTGTTCACTGGTCCCTGAGTGCCTGTGAGTCTCAG 
TGGGAGTTACCTTGAGACCAGCCCTTCTGAGTGGAGGGTGCTGGGTGCTGAGGTCAAGTCGA 
GCTCAGTCCAGGCTAAAAGGAGAGCAGCTCTGGCCAGGCTGTCAGGGCTGTGGCCTCCCCAAG 
AACCTCCTACCCTGGCCCCTCCAGGCTTTGCTGCTATGGTTGTGTGAGGGGAGTTGCTGTCCCA 
GCATTCTGGCCCCCTTGCCCCCAGCCCCTCCCTGACCTCCACGGGCTTCAGGCCTCAGTCCAGA 

15 GTCACCTCCTCTAGGAAGCCATCCCCCAGTGCAAGTCTGGGCAACATTCCTCCTTGCCTGGCCC 
ACCTGCTG A€ TCTC ATGCT ATGGCTTTCTGT A AGC A A AC AC A A AG AT AGG A AC A ACTCTGTCC 
CTGGCACAGAGCAGAI'GCTCTGGCAATATCTCATGAGTGAATGAAGGCACATGACAAACCTC 
CAGACCTGTGGAGACTGAAGGCTGAGAGCCTTTATAGATGCTGTGGGGCCGAGGAGTTTGCC 
A ACT AC A GC AGG TCA TG CCCAGA G G TTTC TC T C T GG G TAGCAAGGTGTGTCTCCCAC C AAAGG 

20 CCATTGGCATGGGGCCCGCCCTGCTGACCCGAGGCAGTGCACAGCAGAGGCCAGATGCAGTG 
AG AAGGAGC C TCTCCTTGGCCTGCTGTCTGCTGCCATGCCTGTGGGGGCGTGGACACAAGTGT 

TGTGTATGTGCATGTGGGGGTGTGTGTGCATGCATGTGTGTGTGTGCATATGCACGTGTGTGC 
ATATGCATGTGTGTGCATGGAGAGAGAAGACCTCCTCITI'CTGGCCCCTCTCCTAGCTGCCCCC 

25 CTCCCTCCTGCTGCC AAC AC ACTGTC AACCCTTC ACTGTC'n Tl'l CCTTGGG ACTCGTTGATCTG 
TCTCTACCATCCCAGGTGTCTGGAGCAGCCTCTAACCTTCCATCTGCCAAGGTACTTCAGCCCC 
ACCCCTCCCAGCTGTGGAATGTCCCCTAGGATGTGCCACTGACACAAAGAGCCACACAGCTCC 
AAAATAGAATATTATCTAACCCACTGCTCCCTTTGCTGTCAGCAACACCTCCACCATGCTTCTC 
CCAGGACCCCCCrrGAACTCTCTGCTl^CCTCCCTGAGGCCAAAGGAAAGACAGGAAAGGGGCC 

30 ACCTTCCTGTCCTTGGGTCCCACAGAGATGTATCCTTGTAATGAAACCTACTTTATGCTTGAGT 
TGTATCCAGTTAG T TTC T GTGCKrrrGCAATCAAGACC C ACA C CCACCrCAACCCAGGC T CTAGA 
GAGTAGACCCTTGTTTrTGCCTGGCTTGGGTCGACCTGGCACCTGCCAGGGTCCCAGCCTCTGA 
GTCAGCCCACCTTGCCCTCATCGGTGCCACCTCCAGGCGGCTGT 

A CATAGACTCTGGCT7CTGCCCTGGCCTGGCCTCTGGGAACTGCAGCTGTCTGCTTCCATCCTA 
35 TGTGGATGGTGCCTGAAAGTGAATAGGGATCAGTTACCAGCCCAGTATCTGTCCCCTTCTCAA 
TAGCACTGATTCCTATGGGGAACTGCTTTTCTTGGACTATGTATGGGTTTGGTGGGAGGGTAG
TTCCTGTAAC CAA.CCCTACAGG G TG T AGGAACCTAGACTCTCAGCAACATAAC AGGCAGC 
A GG C TCCCAAGCTA AG TCT GGCCAGCTGGGCCACCTC TCCCAGA TTCTGTTTCATGAGAGCAT 
CATCCAAGAGCAGTGGGAACACTGGGGACGGTCCAGCCTAGGACTGGTATGCAGATCAGAGA 
5 ATCCCAGATAGAAGGl''GA TrC K]) TG TTCTTCCAGT1TClTGGCCCTCCAGAGCAACCATACTTCC 
CATCTGCCCCAAAACCTGATCCTCCAAACTCCCACCATTTCTGTGCATCCCCAATATCTAA 
TAGA T CA AC TG CC 1T T CA T1T A CATT TGTCACAACCAAATGATACA C CT G CCC T TCA C CCA CT A 
CTGAACTGCAGCTGGGTTAGTCCAAATTCAGGGCCCACGTGTCATTTCAAGCCTGTCTTGAAT 
AATGTACACCTTCCTGCAATGTGAGGATGGCCACCACCT T GG TCTT ATAC CC AC GGGT GT CCT 

10 GAGCTACATTrCTCATAATCAAAAATAAACTCAACACATCACTCCAGCCTGAGCAACAGA 

OC A AO A O ACHT A OOTCTTAl A AA A-I-A AAA A AT A AAAA€A AA£-AcAArI^fAKAtAAA€XyGArG€ A AACyT-I - 
GGGGAAAGAG G AAGCAC CTG ATTrCCAGAGrnrCACATCATGAGATGCAAATGTCCAGTTr 
TCAACAACAACAACAACAACAAAAAAAAAATCACAAGGCATACAAAGAAATAGGAGACTAA 
GACCCACTCAAAGGAAAAGAATAAATAAGCAGAAGCCATACCAGAGGAAAACCAGATGGCT 

15 GAC1TACTAGACAAATACTTTAAAACAACTGTCTTAAAGATGCTTGAAGAGCTAAAGGAAAA 
TG-TGAACAAAGTGAAGAA AGTGATGG AACAAA-T-GGAAATTCCAAT 
TTTTGGAGTTTTTTTTCTTGGTAGCAAAAAATTATGAAGCT 

GAGGGCTTCAAAGGCAGATGTAAGCAAACTTGGCCAGGTGCAGTGGCTCATGCTCATAATCC 
AC K^AC TT TG GA A G GCTG AGGCAGG A GG A TTGC T T G AGC C CAGGAGTTT G AAA C CA G CCT G GG 

20 CAACATAGAAAAACCCTATCTTTAAAAAAACTTATATAAAAT T TAAAAATTATAAAATTTAT 
TTAAAAAATCAGCAA1TTGAAGACTGGACAGGGAAATTATCA<\ATTTGAGGAACAGAAAGG 
AAAAAGA TGGA AGAAAAATAAACAGAGCCTAAGAGACCTGCGGGACACCATCAAGCAGACT 
AATACCCATTGTGGAAATTCCAGAAAGAAAAGAGAG T GAAG G ACCAGAGAGATTATTAGGA 
GAAATAATGGCTGAAAATGTC T CAAA'ITT G AT G AA TG ACA TGA ATATGAACATT CA AAA A TC 

25 TCGACAAACTCCAAGTAGGAAAAACTCAAAGATACTCATACTGAGATTCATCATAATCAAAC 
TGCTGAAAGCCAAAGACAAGGAGACAATATCAAAAG CTGCA AGAGAGAAGTGACTCATCAC 
ATACAAGGGATCTTCAAAAAGATTATCAGATATCTTGGCTGGGCACGGTGGCTCACACCTGTA 
M- CTTAG C AC r lT T GG GAGGC C G AGGC AGGTGG AT C AO^^ A GGTCA GGAGTT T GAGACCAGC 
CTGGCCAACATGGCAAAAACCCATCTCCATTAAAAATACAAAGATTGGTGAGGCATGGTGGT 

30 GCATGCCTGTAATCCCAGCTACTCGGGAGGCTGAAGCAGGAGAATCACTTGAACCTGGGAGG 
CGGA G GG T GC AC CAAGCCAAGATCGTGCCAC C ACTGCACTC C AGC CT GGGTGACAGAGTGTG 
ACCTTGTTTC A AAA AAA A A AG A A A A AG AAA A AG AAA AAA A AG ATC ATC AGCT ATCTC AT C A 
G A A ACCTC AG AGGCC A A A A GGC AGT AG ATTG AT AT ATTC A A A GTGCT A AA AG A A AA A A AT A 



35 ATGAAGACATTCCCAGATAAACACAAGCTGAGGGAGTTCATTATCACTAGATCTGCCCTGCAA 
AGAAA G CCAAAGAAAGCC rrTC AG GAT G A AATG A AAGGA T AC T A GACAGTG A CTC AAAG CT 
GAATAAAGAGGCCAGGCATAGTGGCTCACACCTGTAATCTCAGCACTTTGGGAGGCTGAGAT 
GGGCGGATCACCTGAGGAGTTGGAGACCAGCCTGGCTAATATGGTGGAACCCCATCTCTACG 
AAAAATACAAAAATTAGCCAGGTGTGGTGGCACATGCCTGTAATCCCAGCTACTTGGGAGGC 
TGAGGCAA G AGAA TC AC C TGAAC C ; C AGGAGGCGGA GGTT GC AGTGAGCCGA GATTGTGCCAC 
5 CGC A CTCC AGCC'rGGGTGACAGAGTGATACCCTGTCTCAAAAAAAAAAGCCGAATAAACGAA 
TAAAGATCTCATCTATGGCCGTACCACCCTGAATG T GTCCAATCTCAGAAGCTAAGCAGAGTT 
GGG CC TGGITAG T A CTTGGA GGGGAGAAATAACGGTCTATGCTAAAGGAAAATTCAGGTGCA 
ATTA A ACT A A AATT A ATT AT AT A A A AG AG A AT AC ATTA A A AGCTAGTATT ATTGT A ACTTTG 
& TTTGTAATTCCACCAAGTGGAATTrGTTCCTGAAATGCTAGAATGGTTCAACATAAAAATCA 

10 ATAAATGTAATAGACCACATTAACAGAAAAAAAACCCACACGGTCATCTCAATTGATGTCAA 
AAAAGTATTTGACAAAATTCAACACTCTTTTGAAAGAAGAAAAAGCTCAACAAACTAAGAAT 
AGGAGGAAACTACCTCAAATAATAAAATCCATAGGCCAAATCCCCAAACTCACAGCTAGCAA 
CATATTTAATGCTAAAGACTGAAAGCTTCCCCTTTAAGATCCGGAATAAGACAAAGATGCCCA 
CgJCACCACHCTACTC 

15 AG A A ATA AAAAGCATCTG A ATTGG A AAGG A AAA AGT A AA ATTATTTGTTTGCCC AAT AC ATG 
TACAATGTTTCAGGTGAAGGCTCAGAACAGTACAACCTTACCAGCAAGAGTCCTGCTGTCTCT 
GTGTGAATCCCAGCTA1TACTCACTAGCTACATGATCTCTCTTGCCCTCCCTGCCTCAATITCCT 
CATGTGTAAAGTGGGAGAAAAATAATAGTTCATGCTTCAAAGGTTTTTTGTTTGTTTGCTTGC 
T3^ ^GACAGCGTCTCK3C TC TGTCC^TCAGGCTGAAGTGCAGTGGTCK:AATCTTAGGTCACTG 

20 CAACCTCAGCCTCCTGGGCTTAAGCGATCCTCCCACCTCGGCCTCCCAAAGTGTTGGGATACAG 
GCGTGAACCACTGTGTCTGACCCAAAGGATTATTTGAGGAGCAGATGAATTAATGTGTCATA 
ACCTCAAAGCAGTTGCAAAGGCGTTTAATAATTAAAATATCACATTTTAAATTAAAATATAA 
GGCTGGGCGTGGTGGCTCATGCCTGTAATCCCAGCACTTTGGGAGGCTGAGGTGGGAGGATC 
AC'rrGAGCCCAGGAGITCCACACTAGCCTGGGCACCATI^GGGAGACCCTGTCTCTACACACAC 

25 ACGCACACACACACACACACACACAAACTTAAAGTAGCCAGGCGTGGTGCTGCGCGCCTGTTG 

TGAGC CGAG ATCGCACCACTGCACTCCAGCCTGGGCCACAGAGCAAGACG CT GC C TCAAACAA 
ACAAA€AAAAA€AAAAA«AAAAmHAAGm^ 

GTGGAGACAAGACCTGGACTTAGGAAACAGGCCCAGGGAAGTAGCAGAACAGTAGCGCTAG 
30 AGGACGCCTGGGAGAATCAGCGCGCGGCGGGAAGAGCCCGGGAAGCTTAGTGGGGAAGCGT 

TCCCAAAGTGGGAGACAGCACTCAGAAAGACGTGGTGGTAAGAACGAGTATGAGTAACGGG 
GACAA C GAG G ACACTGGAGATTGGGGAGTGTT G GGCTGGAAGCTGGTGTGCA GCT GTGGGCA 
AGCTAGGGAGGACCCCGAAACCGCCAATGCGTTTCCCG GA C GCAG ACGCTGGCAGGACGGGA 
35 GGAACCCCGAGACCCCGCGCCATCCCTTCAGGAAGAGTTACTTCTCCCCGGCCAAGTTAGTGG 
GCCITGGGCCririTTCTGITGGGATCCTCCTCGCGTGTCGCCATCGCTA 
tgcggggaaagctgggacgctgggggcttcaccaaggaggctggcggccgaccactgggagg 
t ctg gc gg g gtgacgac c actgggaggtttgggcagggcctgacgg ggtg a cgc gg tca g cc 
cactggaggccgacaccccccgtcagcccaacccctgcacgcgcggccgccaaccaaagaccc 
gcggcgccggcctgcgagcccccgccccgcgttgcccaggaaaccgagggtgtggctccgcgt 

cggggcgaggagggcagcgattggtgagattaggcgatgggcggggaagccgcgcggggat 
tagcgagtrgcggcgatgggcggggcaggcgcgcggggatrggcgggatgcggcgcgccgcg 
cgttgagtggggtccagggaaacggggtcagctgggggtggcagttccaggccgcgaggccg 
ggctcctgggtcggtgggctggtgtcttggcggacgtcccgcagctgccgcgtggatccgagc 

10 cggggcacccgccgtgactgggacagcccccagggcgctctcggccccatcccgagtagcgcg 
gcctggctgctgccgccatca/ x gcac gttc gagccaaaagctcctaacgagtcactcgttaga 
cacgtgtgcggagccrrgtgtcccaggccagtgctgtcccgtggagatagattgcaagccgcta 
gggaattttttaactttctagtaggtgtacgaaaaaagtaaaacgaaacaaa t caattggag 
taaatc c ata aatatattcaaactattatttcaattgtatgtgaaaaaattattgggatattc 

15 tttgt act attcttag a aatcc attgtgtgtcc a accc a a ac atc ac agttgg actc acc ac at 
ctcctgtacti"cgtagccctaggtggctagtggcataagacacai\aaatctcagctctcctgg 
agcttatggtctagitggagcaggcagacaatacatttaaaatatacagiitgttagaaggt 
aaatgttgtaaacaacaataacagttgaagta'ctggggagagttgcagttgtaaatcagatg 

20 gctctataagtatacgggagaggggcaagcaagagttcagaggccccttgctgtggggaggg 
atccaaggtggaggagtgggaaccaggaggggagaggaccagtggagcagatctcataggc 
actt^4a^a€hggg gccttattcaatgaaatgagg 

gcagtgactgatttatgttttggttttggtttagttctattattatttaataataggcttatta 
trtcacagaagttttatttaataaggcagacctcttgtctggaaatgagacaggtgccg 

25 agctggatggaggcagatcgggaattccatttggggcaaactgaacttgattgagaccctgg 
t^mot-gto ^ta tggaac aggacacc tg agtcta g gg t tcgggaagaactccag a 
aacactcctagctttccttttctctttttggatgaccgctacagggtgagacatcggtatccag 
gcacgataaatttccaagtggacacaatgtctggtgtcaactacagctgttctccttcttttcc 
cagtatcctrrgggtgcagtgagacaccaggagagctgctgctitgggggatggacaggggc 

30 agcaggaatgcctrtgtgttttcgcagtgaacctccttggcctgggcgaagctgtgtggacca 
ag c aag tc aggag t gtgck:c a tgttttctga gc ag gctg cccagaggcm:ccac a ctc t act g t 
ccy:caccatcagccaacaatgcx\^cctttgcccgggtgccagtggcaacctacaccaactcctc 
acaacccttccggctaggagagcgcagctttagccggcagtatgcccacatttatgccacccg 

€CT4^X^AAT45AGAC€ Cm:CTC^ 
35 TGGGAAGGTGCTTCCCCCACAGCATCCCTGAACTTAGAAGTGTTCTGCAAGAGAATGGGA/\C 
AGTn'ATCT A ATTGATCCCACTrCCTGTTACCTTGGGAAAAITAACCTCTIlTTCCCTCAGTIT 
CTTCTTAAGATAGTAACAAGGATTAAATTAAGTAATTTGTGGGTTTGGAGTTAGTTTTAGTTC 
AGAGGCTGGTTGGAGATGAGGACTTAGTTCTGGCGGTGATGGCGATTACTTCACTGGCAGAG 
GAAAATGGTTTTCCTATCTTCAGTGCAGATTATTCAGGTATTTGCCTGTGCTGTAGCCAGAGA 
GCCCCTCAGTGTGGCAAGCCTGGCGCCAGGCACCAGGAGCCAAGACTGGTGAGGATGCACTCT 
5 C TGGTCTCGAGGGGACCCCCTCTG r ITCACTCATGTCTGTTrGCCTCTCCTCCTGGCCCCCATA1T 
TCCTGGCCATGAATTTTCCTGTCCCTTGGGCCCTCTGTCTTTCCTAATAAAGTGGCCTGCCCAA 
CACAACCCTTGTrCTTTGCCCCCATTTCTTCCCTGGTGATCTCTCCn^ 

GTGGTGAAGCAGGGACCCCCATCTCCCCCTTTGAGTTTATTTGAGTTTTAGGTGCTGCTGCATT 
CCCC C ATTCCT A CCACTTACATA AGA GTGC KT riTCCAGGTAA TTTTCAAATCCATC 

10 TATTTTTA AACTGAGGATTTAGTAGGTG A GA C CAGGTCTTACTCATTTTTACTGTCCTTGGCA 
CCAGGCAAAATGGATCTCAGCCCTAGTTGCACATTGGAATCCCCTGGGGAGCTTTGAGAAGCC 
CATCTCATCCCATGCCAAGCCAAGATCAATrCTCG r rTATAGGCAGGCGGAGAACCCTGGGCCT 
AGAAATCTAGCTAGAACCTCAAATTCATTAGGGATATGTATTAGTCCATTTTCACATTGCTAT 
AAAAAACTACCTGAGATAGGGTAATTTATAAAGAAAAGAGGTTTAATTGACTCACAGTTCCT 

15 CATGGCTGGGGAGGCCTCAGGAAACTTAACAATCATGGCAGAAGGTGAAGGGAAAGCAAGG 
CTCTTTTACAJGATAGCAGGAGAGAGAGAGCAAGGGGAACTGCCAACCATTTTTAAACCATC 
AGATCGCATGATGGCTrGATCTCACTCACCATCACAAGAACAGCATGGGGGAAATCCACCCCC 
ACAATCCAGTCACCTCCCACCAGGTCCCTCCGTCAACACCGTGTGGATTATAATTCCAGATGA 
GATG T G GGT G GGGACACAGAGCCAAATC A TATCAGGATGTTTTCTGTTTTGTTTACCTGAGAC 
.20 AAAGTGCTGTrCACCTCTCCTCTCCCACATAATCAGGGGCTCCCTCCTGCGGCTCCGGTAGCTT 
TTCCTCACTTTCCTrTCAGCCCTCGGGACACCTTCCTlX}^ 

TGGGCC C AATGTCAA T G CC ACCTTC T AGATTCTTTCCGGCAGCAC C TC CT CTGGTCGCACATTT 
CTCTTCCAGTTATTGGAGCTGTCAAAAAAGCTCCCCAGTGATGGACGATAGCGATTTCACTGT 
GCTCACAGAC r rGGTCAGGAAACCAAACAGCTGCCACAGTGAATGTGTrGATAGCAGCGGGGC 

25 AGCAGTAGCACTCGCTCACAGGCCTGGTGGTTGGTGCTGGCCCCCACCCTGAATACCTACATG 
TGGCTTCTCCATGTGGCCTGTGCATCCTCACTGAAGCTCAGCCTGTCTCTCCAAATTGGTCTTT 
CCACTCACCTGTTCCCCAAACCTGCCCAGACCTTCCTGCTGTAGGCTTTTCCCTTCACTTGGCAC 
A C TCTTTCCC1TG T CTTC C CATGGC C CCATCTAAGCCCCACTGTCAG CTG AAG T GTTATATTCTT 
TGAGGGGCCACCTGAAGCCACCITGCAA T GAGGGCCTCCGlTlTCTACCTCAGCTCACCATTrG 

30 TTCACAGCACTTGT C A C TGTGGCGAGTTACTTGTCTATGGCCTGTTGTCGTTCTCCTGCCTAGA 
CC C AGTGGG C TGAGTGGGGGC A AG T GTTGGC TT T T ATGTCCAGTTTTGATCT^ 
CATTGCCTGGGTGGAAGCATGTCCTACTATCGGTTACAGGGATGTCATTCTGCCCAGTGCTCA 
GGG GCATA C^CH^GAKCCAGTO 

G TTGGGTGATAATATCTACTCCTGGCACATTTTCAGCGTTGGCTGAGTTACATTACAGTGCTT 
35 AGGCCACCTGGGGGAGAGTAAGAGTGGGATACGTGAGGATGTGGAGTCTGTTGCATTTCTGT 
CTGCTGCTGGCATCCTTCTTGTCTrGTTTTGAGTTGCTCGCCTCTGTCTGCTCCCTAGGGCGTA 
gatttgaggaatattcctggttcttcccaggcagcaggggctcaggctgtgctggagtcagct 
aggct aa g g g gc t g g tc t g gc a t ccgcgttgtcctgt c acc t ccttggtgttttctccaggcct 
ggatctgtgctgtgtgggc:acctgtattcctccctcctgccctcactgattctccatacctttct 
tctcgagagtgccaagcccctcccatgtgttc r rtgttcatacctaggatcccgggaaggggct 

5 GGGGAAGACGGTGCCCAGGTGCCCTGflGTAAACAAAGCCACCTGACTCCACGGG^ATGGA^T 
GGGTGGAGGGGATCTGAGGTCTGCATTTTGAGTATCTCTGGTCTCAGAGGATGAAGCATTTG 
GTGGGGGTTGGGGGTGGGGGGTAGGGTGGAAGAATCTAAAGTCTTAAAAGAAAATGGCAGT 
TATTTGTGGGACAGGGCTGTGTTGAGACTTGGCATGCTTCTTTTTAAGAGTCAGTGTTGTAAT 
T- TAGGT ATA A GTG A A GC AGT ACTTTGT ATTAGTTTCCTGT AGGCGCTGTA AC A A AGC ACC AC A 

10 AACTGGTTGACTTAAAACAACAGACATGGCCGGGCACGGTGGCTCACGACTGTAATCCCAGC 
ACTTTGGG A GGCCGAGGC GGGC AG ATC AC A AGGTC A AG AGATTG AG ACC ATCCTGGCT AAC A 
CGGTGAAACCCTGTCTCTACTAAAAATACAAAAAAAAAAAAAITAGCTGGGCGT GGTGGCAC 
ACGCCTGTAGTCCC AGC T AC TCGGGAGGC TGAGGCAGG AGAATGG CGTGAACCCGGGAGGCG 
GAGCTTGCAGTGAGCTGAGATCGCGCCACTGCACTCCAGCCTGGATGACAGCGAG ACTCCGCC 

15 TCAAAACAAAAACAAAAACAGAAACAACAATAACAGAAAAACACAGACATTTACTCTCTGGC 
AG-TXCTG GAGGCCAGAAGTTGAAATCCAGATGTCAG C A GGATTG GCTCCTTCTGAAGGCCCG 
AGGGGAGGGTCCTTCCTGGCCTCCTCCCTGGTGTTCCTGGGCTTGTGGCCGCATCACTCCGCTC 
TGCCCGTCTTCACACTCCCTCTTGTCTGTGTGTCTGTCTCTCTGTTCTCATGAGGACACTTGGCA 
TC C AG G GCCCAA C CACACC C AGAGTCCC T GGTCTCCTGTGGC T GAC T CACTTTTTACT GTC 

20 GTGAAGTCCAGGGGGTCCTTGTACTTGATGTTCTCTCCTGGCAAGGCCAGGGCCCTGTGATTG 
GCCTCTCATGGAGTGCTGGGCAGGGCCTCCATGGCCTCTGTCGGGCGGGGGGGCTACTTCATC 
TCTGAGTCTGTACCCCTCGTGTCCCAGGCAGTGGAGTGGGAGTGAAGAAGCTGTGTGAACTGC 
AGCCTGAGGAGAAGTGCTGTGTGGTGGGCACTCTGTTCAAGGCCATGCCGCTGCAGCCCTCCA 
TCCTGCGGGAGGTCAGCGAGGAGGTGAGGCAGGGTGCTACACAGTGGGGCCGCCAGGCAGAC 

25 CT GGCCTCCCACTAGAACACCTCCCTGGAGGT GGGGTT GTGGGGAAGCAGGTTCAGAGACAA 
T-GGACTCCAGAGGGG1^G G GGGCT G CGG TGCCAGCTC ACTAAC A C CAGAGCTTTGGTGGGCTCT 
GGCCCCAAGATTATACCTCCTGTCTCTGCATTCCAGCACAACCTGCTCCCCCAGCCTCCTCGGA 
GT AAATACATACACCCAGATG ACG AG CT GGTCTTGGAAGATGAACTGCAGCGTATCAAACTA 
AA A GG CAC C ATTGA C G TGTC AAAGCTGG ITACG GG T AG GG A GCCCAA TGAGA GG A TGTGGGX 

30 GATGCAGGTGAAGAGCCCAGCGGTGGTGTGTTAGGGATGGTGTGAGTGGGGAGCCTGGGGG 

GTCGGAGGCCATCAGATTGGGTGAGACCTGGCTGGGAGATGGGTCTCCCCACCTCCATCCAAG 
GGCAG T GACTCCAGGAAGCAGGCATGCATCCTGGAGTCCTAGGTGAGAATTCACCAATGTGG 
TTGTGGAGAACTGGCTTGTTTTGCCCGTTGGGGTGACTGGAAGGAGTGGTAGCACCTGGGGC 
35 TCCCTGCTCAGGCCTGATGCCACTGCTCCCCAGGGACTGTCCTGGCTGTGTTTGGCTCCGTGAG 
A G AC G A C GGGAAG T TT CT GGTGGAGGACTATTGCTTTGCTGACCTTGCTCCCCAGAAGCCCGC 
ACCCCCACTTGACACAGATAGGTGAGCAGCAGTTCTCGGGAGCTGGAACCAGCTCATGGTCAG 

CTTCTCCCGCAGGTTTGTGCTACTGGTGTCCGGCCTGGGCCTGGGTGGCGGTGGAGGCGAGAG 
CCTGCTGGGCACCCAGCTGCTGGTGGATGTGGTGACGGGGCAGCTTGGGGACGAAGGGGAGC 
5 AGTGCAGCGCCGCCCACGTCTCCCGGGTrATCCTCGCTGGCAACCTCCTCAGCCACAGCACCCA 
GAGCAGGGATTCTATCAATAAGGTATGGAGCCCACCTGGCTGCATTCAGCCCCAGCCCAGGAG 
CCTGCAAGCCTGTAAGACCCTCCTTCCCCAGGGCGAGTAGGGTACCCTGTGAGGTCTCGCAGG 
TCGGTGGGAAGCGCCCTGCAGTGACTCTGGGGCCTCCTGCAATGGGGCTCCTCATGCCCAGGC 
C CTCGCTGAGGATGGTGGGAGGCTTGAAGGGAGTGAGGGTCTATGGGACAACAACTGCATCT 

10 TCCAGCTGGTGGGGCTCTACTCTCCTC T GAGCCTGGGACTCGCCTGGGCCTGATGGCCTTCTGG 
GCTTCTATTCCAGGCCAAATACCTCACCAAGAAAACCCAGGCAGCCAGCGTGGAGGCTGTTAA 
GATGCTGGATGAGATCCTCCTGCAGCTGAGCGTGAGCGAGCTGGGGGCTGGAGGGGTGATGG 
GGATTGCAGTCTTCAAAGCTGCCACTGGGCAACAGAAGGCAGGCAGGAGGGCAGGGGGAGT 
GGCCGGAGTTGGTGTAGGGGGCTCC1TCGGGGCCCTGTGAGCTCTCCCTGCCCTGTGCCTTCCA 

15 GGCCTCAGTGCCCGTGGACGTGATGCCAGGCGAGTTTGATCCCACCAATTACACGCTCCCCCA 
GCAG CCCCTCCACCCCTGCATG^rTCCCGCTGGCCACTGCCTACTCCACGCTCCAGCTGGTCACC 
AACCCCTACCAGGCCACCATrGATGGAGTCAGGTAGCTGGCACAG C CACACITCAGTCTGACC 
CACjCCTTTTGCCTCAGGAGGCACAAAGAAGGGAGGGGAGGGAGGGCCCAGGAAGGTGGCAG 
GGGTOGAGAGG€€GA4X^AGCA-1^ 

20 TGAGCTCTGGGCTGACCACTATGGGTGGCACCCAAAGCCAAGAGTCAGCTGAGCTTTGCCTTG 
CAGATTTTTGGGG A CA T CAGGACAGAACGTGAGTGACATTTTCCGA T ACA G CAGCATGGAGG 
ATCACTTGGAGATCCTGGAGTGGACCCTGCGGGTCCGTCACA T CA G CCCCACAGCCCCGGACA 
CTCTAGGTAACAGGCTCAGCCATACAGGGTGGGAGCAGAGGGCCAGGAGGCCTGGCAGGACC 
CTGAAGTGCACAGGGl'CCCCCTGTGGGITTGCACTrGCCAGCATI'GCTGAGAACTGTCTGAGG 

25 AGAAGTTCAGAGGCTTGGCACCTGCTCTGGAAGCTACTCTGGAATCTTAATTCTAAGGCCAAT 
GGCTGCCCACCCCAACGGGCAGCAACAGCAGGGCCAAGGTCTTGTGACAATGTCTGGAGGTG 
CCCCTATTGTCACACTGGGGGTCTCCTACTGGCCTGCAATGGGAGGAGGGGCTGCAGCCCCAC 
ATCCTGTOCA CfAGTGCTAGTGCTGAGGCGGAA CCCTC CTCAGAGCTGCCCCTTCTCCTCTAGGT 
TGTTACCCC1TCTACAAAACTGACCCGTTCATCTC 

30 GCAACACCGCCAGCTTTGGCTCCAy\AATCATCCGAGGTAATTTTTGTCTTCTGGGGGCCCAGG 
CrGAIXlX^l^ATTTC 

AGTCCCCCTGGCCCTTGTGGGATGGACAGCTGAGGTCTTCTGCACAGCTGCCATTTCACTGTG 
GGAGCCAAGCTGCCTCGCCAGCTGGGCAGGGACTGGAACGGCTCCCAGCCTGTGTGCCTCTCA 
AGGCTAATCTCTGGTCTCCTATTGTCACTGCCCCACTGTGTGCCAATGGGGACTCCTGTTTATT 
35 TCTGGCAGCTrCTCTTTGAGGGAGGACTTACTTGGAACCTACAGTGGGTCCTATGTGACTTCTT 
TGCAGGTCCTGAGGACCAGACAGTGCTGlTGGTGACTGTCCCTGACri^CAGTGCCACGCAGAC 
CGCCTGCCTTGTG 
GAGGACGAT6 A CCTGGG A G 

AGAGGCCCAGATGGAGGCTGTTCATTCCCTGCAGTGTCGGCATTGTAAATAAAGCCTGGCACT 

5 TGACCATAGCCCATGCACCCACCAGCCGGTCTCCCT 

The POLD2 gene is 19,000 base pairs in length and contains ten exons (see Table 14 below for 
location of exons). As will be discussed in further detail below, the POLD2 gene is situated in genomic 
clone AC006454 at nucleotides 119,001-138,000. 
The polynucleotides of the invention have at least a 95% identity and may have a 96%, 97%, 
98% or 99% identity to the polynucleotides depicted in SEQ ID NOS:5, 6, 7 or 8 as well as the 
polynucleotides in reverse sense orientation, or the polynucleotide sequences encoding the SNARE 
YKT6, human glucokinase, AEBP1 or POLD2 polypeptides depicted in SEQ ID NOS:1, 2, 3, 
or 4 respectively. 

A polynucleotide having 95% "identity" to a reference nucleotide sequence of the present 
invention is identical to the reference sequence except that the polynucleotide sequence may include 
on average up to five point mutations per 100 nucleotides of the reference nucleotide sequence 
encoding the polypeptide. In other words, to obtain a polynucleotide having a nucleotide sequence at 
least 95% identical to a reference nucleotide sequence, up to 5% of the nucleotides in the reference 
sequence may be deleted or substituted with another nucleotide, or a number of nucleotides up to 5% 
of the total nucleotides in the reference sequence may be inserted into the reference sequence. The 
query sequence may be an entire sequence, the ORF (open reading frame), or any fragment specified 
as described herein. 

As a practical matter, whether any particular nucleic acid molecule or polypeptide is at least 
95%, 96%, 97%, 98% or 99% identical to a nucleotide sequence of the present invention can be 
determined conventionally using known computer programs. A preferred method for determining the 
best overall match between a query sequence (a sequence of the present invention) and a subject 
sequence, also referred to as a global sequence alignment, uses the FASTDB 
computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990) 6:237-245). 
In a sequence alignment the query and subject sequences are both DNA sequences. An RNA sequence 
can be compared by converting U's to T's. The result of said global sequence alignment is in percent 
identity. Preferred parameters used in a FASTDB alignment of DNA sequences to calculate percent 
identity are: Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30, Randomization 
Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the 
length of the subject nucleotide sequence, whichever is shorter. 

If the subject sequence is shorter than the query sequence because of 5' or 3' deletions, not 
because of internal deletions, a manual correction must be made to the results. This is because the 
FASTDB program does not account for 5' and 3' truncations of the subject sequence when calculating 
percent identity. For subject sequences truncated at the 5' or 3' ends, relative to the query sequence, the 
percent identity is corrected by calculating the number of bases of the query sequence that are 5' and 
3' of the subject sequence, which are not matched/aligned, as a percent of the total bases of the query 
sequence. Whether a nucleotide is matched/aligned is determined by results of the FASTDB sequence 
alignment. This percentage is then subtracted from the percent identity, calculated by the above 
FASTDB program using the specified parameters, to arrive at a final percent identity score. This 
corrected score is what is used for the purposes of the present invention. Only bases outside the 5' and 
3' bases of the subject sequence, as displayed by the FASTDB alignment, which are not 
matched/aligned with the query sequence are calculated for the purposes of manually adjusting the 
percent identity score. 

For example, a 95 base subject sequence is aligned to a 100 base query sequence to determine 
percent identity. The deletions occur at the 5' end of the subject sequence and therefore, the FASTDB 
alignment does not show a match/alignment of the first 5 bases at the 5' end. The 5 unpaired bases 
represent 5% of the sequence (number of bases at the 5' and 3' ends not matched/total number of 
bases in the query sequence) so 5% is subtracted from the percent identity score calculated by the 
FASTDB program. If the remaining 95 bases were perfectly matched the final percent identity would 
be 95%. In another example, a 95 base subject sequence is compared with a 100 base query sequence. 
This time the deletions are internal deletions so that there are no bases on the 5' or 3' of the subject 
sequence which are not matched/aligned with the query. In this case the percent identity calculated by 
FASTDB is not manually corrected. Once again, only bases 5' and 3' of the subject sequence which are 
not matched/aligned with the query sequence are manually corrected for. No other manual corrections 
are made for purposes of the present invention. 
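
The manual correction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and its arguments are hypothetical, the FASTDB percent identity is taken as an input rather than computed, and the counts of unmatched query bases at each end are assumed to have been read off the FASTDB alignment.

```python
def corrected_percent_identity(fastdb_identity, query_length,
                               unmatched_5prime=0, unmatched_3prime=0):
    """Apply the end-truncation correction to a FASTDB percent identity.

    fastdb_identity  -- percent identity reported by the alignment (0-100)
    query_length     -- total number of bases in the query sequence
    unmatched_5prime -- query bases 5' of the subject that are not matched/aligned
    unmatched_3prime -- query bases 3' of the subject that are not matched/aligned
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    overhang = unmatched_5prime + unmatched_3prime
    if overhang < 0 or overhang > query_length:
        raise ValueError("unmatched base counts are inconsistent with query_length")
    # Unmatched terminal bases, expressed as a percent of the query length,
    # are subtracted from the reported score. Internal deletions are not
    # counted here, so they leave the score unchanged.
    penalty = 100.0 * overhang / query_length
    return fastdb_identity - penalty


# First example from the text: 95-base subject, 100-base query, 5' truncation.
print(corrected_percent_identity(100.0, 100, unmatched_5prime=5))
# Second example: internal deletions only, so no correction is applied.
print(corrected_percent_identity(95.0, 100))
```

The same arithmetic applies to the polypeptide case, with residues N- and C-terminal of the subject sequence taking the place of 5' and 3' bases.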

A polypeptide that has an amino acid sequence at least, for example, 95% "identical" to a 
query amino acid sequence is identical to the query sequence except that the subject polypeptide 
sequence may include on average, up to five amino acid alterations per each 100 amino acids of the 
query amino acid sequence. In other words, to obtain a polypeptide having an amino acid sequence at 
least 95% identical to a query amino acid sequence, up to 5% of the amino acid residues in the subject 
sequence may be inserted or deleted (indels) or substituted with another amino acid. These alterations of 
the reference sequence may occur at the amino or carboxy terminal positions of the reference amino 
acid sequence or anywhere between those terminal positions, interspersed either individually among 
residues in the reference sequence or in one or more contiguous groups within the reference 
sequence. 

A preferred method for determining the best overall match between a query sequence (a 
sequence of the present invention) and a subject sequence, also referred to as a global sequence 
alignment, uses the FASTDB computer program based on the algorithm of Brutlag 
et al. (Comp. App. Biosci. (1990) 6:237-245). In a sequence alignment, the query and subject sequences 
are either both nucleotide sequences or both amino acid sequences. The result of said global sequence 
alignment is in percent identity. Preferred parameters used in a FASTDB amino acid alignment are: 
Matrix=PAM 0, k-tuple=2, Mismatch Penalty=1, Joining Penalty=20, Randomization Group 
Length=0, Cutoff Score=1, Window Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, 
Window Size=500 or the length of the subject amino acid sequence, whichever is shorter. 

If the subject sequence is shorter than the query sequence due to N- or C- terminal deletions, 
not because of internal deletions, a manual correction must be made to the results. This is because the 
FASTDB program does not account for N- and C- terminal truncations of the subject sequence when 
calculating global percent identity. For subject sequences truncated at the N- and C-termini, relative to 
the query sequence, the percent identity is corrected by calculating the number of residues of the 
query sequence that are N- and C-terminal of the subject sequence, which are not matched/aligned with 
a corresponding subject residue, as a percent of the total bases of the query sequence. Whether a 
residue is matched/aligned is determined by results of the FASTDB sequence alignment. This 
percentage is then subtracted from the percent identity, calculated by the above FASTDB program 
using the specified parameters, to arrive at a final percent identity score. This final percent identity 
score is what is used for the purposes of the present invention. Only residues to the N- and C-termini 
of the subject sequence, which are not matched/aligned with the query sequence, are considered for the 
purposes of manually adjusting the percent identity score. That is, only query residue positions outside 
the farthest N- and C-terminal residues of the subject sequence are considered. 

The invention also encompasses polynucleotides that hybridize to the polynucleotides depicted 
in SEQ ID NOS: 5, 6, 7 or 8. A polynucleotide "hybridizes" to another polynucleotide, when a 
single-stranded form of the polynucleotide can anneal to the other polynucleotide under the 
appropriate conditions of temperature and solution ionic strength (see Sambrook et al., supra). The 
conditions of temperature and ionic strength determine the "stringency" of the hybridization. For 
preliminary screening for homologous nucleic acids, low stringency hybridization conditions, 
corresponding to a temperature of 42°C, can be used (e.g., 5X SSC, 0.1% SDS, 0.25% milk, and no 
formamide; or 40% formamide, 5X SSC, 0.5% SDS). Moderate stringency hybridization conditions 
correspond to a higher temperature of 55°C, e.g., 40% formamide, with 5X or 6X SSC. High 
stringency hybridization conditions correspond to the highest temperature of 65°C, e.g., 50% 
formamide, 5X or 6X SSC. Hybridization requires that the two nucleic acids contain complementary 
sequences, although depending on the stringency of the hybridization, mismatches between bases are 
possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic 
acids and the degree of complementation, variables well known in the art. The greater the degree of 
similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of 
nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic 
acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. 
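
For illustration only, the three stringency levels and the duplex stability order stated above can be written down as a small lookup structure. The names below (`STRINGENCY_CONDITIONS`, `stringency_for_temperature`, `is_more_stable`) are hypothetical, and the temperature-based lookup is an assumption made for the sketch, not a rule given in the text; actual hybridization conditions also depend on probe length and buffer composition.

```python
# Conditions as listed in the text; buffer strings are copied as examples.
STRINGENCY_CONDITIONS = {
    "low": {
        "temperature_c": 42,
        "example_buffers": [
            "5X SSC, 0.1% SDS, 0.25% milk, no formamide",
            "40% formamide, 5X SSC, 0.5% SDS",
        ],
    },
    "moderate": {
        "temperature_c": 55,
        "example_buffers": ["40% formamide, 5X or 6X SSC"],
    },
    "high": {
        "temperature_c": 65,
        "example_buffers": ["50% formamide, 5X or 6X SSC"],
    },
}

# Relative duplex stability, from most stable to least stable.
DUPLEX_STABILITY_ORDER = ("RNA:RNA", "DNA:RNA", "DNA:DNA")


def stringency_for_temperature(temperature_c):
    """Return the highest listed level whose temperature does not exceed the input.

    Returns None when the temperature is below every listed level.
    """
    best = None
    for level, spec in STRINGENCY_CONDITIONS.items():
        if spec["temperature_c"] <= temperature_c:
            if best is None or spec["temperature_c"] > STRINGENCY_CONDITIONS[best]["temperature_c"]:
                best = level
    return best


def is_more_stable(duplex_a, duplex_b):
    """True if duplex_a precedes duplex_b in the stated stability order."""
    return DUPLEX_STABILITY_ORDER.index(duplex_a) < DUPLEX_STABILITY_ORDER.index(duplex_b)


print(stringency_for_temperature(55))
print(is_more_stable("RNA:RNA", "DNA:DNA"))
```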

Polynucleotide and polypeptide variants 

The invention is directed to both polynucleotide and polypeptide variants. A "variant" refers 
to a polynucleotide or polypeptide differing from the polynucleotide or polypeptide of the present 
invention, but retaining essential properties thereof. Generally, variants are overall closely similar and 
in many regions, identical to the polynucleotide or polypeptide of the present invention. 

The variants may contain alterations in the coding regions, non-coding regions, or both. 
Especially preferred are polynucleotide variants containing alterations which produce silent 
substitutions, additions, or deletions, but do not alter the properties or activities of the encoded 
polypeptide. Nucleotide variants produced by silent substitutions due to the degeneracy of the genetic 
code are preferred. Moreover, variants in which 5-10, 1-5, or 1-2 amino acids are substituted, deleted, 
or added in any combination are also preferred. 

The invention also encompasses allelic variants of said polynucleotides. An allelic variant 
denotes any of two or more alternative forms of a gene occupying the same chromosomal locus. 
Allelic variation arises naturally through mutation, and may result in polymorphism within 
populations. Gene mutations can be silent (no change in the encoded polypeptide) or may encode 
polypeptides having altered amino acid sequences. An allelic variant of a polypeptide is a polypeptide 
encoded by an allelic variant of a gene. 

The amino acid sequences of the variant polypeptides may differ from the amino acid 
sequences depicted in SEQ ID NOS:1, 2, 3 or 4 by an insertion or deletion of one or more amino acid 
residues and/or the substitution of one or more amino acid residues by different amino acid residues. 
Preferably, amino acid changes are of a minor nature, that is conservative amino acid substitutions that 
do not significantly affect the folding and/or activity of the protein; small deletions, typically of one to 
about 30 amino acids; small amino- or carboxyl-terminal extensions, such as an amino-terminal 
methionine residue; a small linker peptide of up to about 20-25 residues; or a small extension that 
facilitates purification by changing net charge or another function, such as a poly-histidine tract, an 
antigenic epitope or a binding domain. 

Examples of conservative substitutions are within the group of basic amino acids (arginine, 
lysine and histidine), acidic amino acids (glutamic acid and aspartic acid), polar amino acids 
(glutamine and asparagine), hydrophobic amino acids (leucine, isoleucine and valine), aromatic amino 
acids (phenylalanine, tryptophan and tyrosine), and small amino acids (glycine, alanine, serine, 
threonine and methionine). Amino acid substitutions which do not generally alter the specific activity 
are known in the art and are described, for example, by H. Neurath and R.L. Hill, 1979, In, The 
Proteins, Academic Press, New York. The most commonly occurring exchanges are Ala/Ser, Val/Ile, 
Asp/Glu, Thr/Ser, Ala/Gly, Ala/Thr, Ser/Asn, Ala/Val, Ser/Gly, Tyr/Phe, Ala/Pro, Lys/Arg, Asp/Asn, 
Leu/Ile, Leu/Val, as well as these in reverse. 
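
As a minimal sketch, the substitution groups listed above can be encoded so that a proposed exchange is checked for membership in a common group. The one-letter residue codes and the function name are conventions chosen for this example. The check covers only the six named groups; several of the commonly occurring exchanges listed above (for example Ser/Asn or Ala/Val) cross group boundaries and are not recognized by it.

```python
# Groups of conservative substitutions as named in the text (one-letter codes).
CONSERVATIVE_GROUPS = {
    "basic": {"R", "K", "H"},            # arginine, lysine, histidine
    "acidic": {"E", "D"},                # glutamic acid, aspartic acid
    "polar": {"Q", "N"},                 # glutamine, asparagine
    "hydrophobic": {"L", "I", "V"},      # leucine, isoleucine, valine
    "aromatic": {"F", "W", "Y"},         # phenylalanine, tryptophan, tyrosine
    "small": {"G", "A", "S", "T", "M"},  # glycine, alanine, serine, threonine, methionine
}


def is_conservative(residue_a, residue_b):
    """True if both residues differ and belong to the same listed group."""
    a, b = residue_a.upper(), residue_b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS.values())


print(is_conservative("K", "R"))  # both basic
print(is_conservative("K", "E"))  # basic vs. acidic
```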

Noncoding Regions 

The invention is further directed to polynucleotide fragments containing or hybridizing to 
noncoding regions of the SNARE YKT6, AEBP1, human glucokinase and POLD2 genes. These 
include but are not limited to an intron, a 5' non-coding region, a 3' non-coding region and splice 
junctions (see Tables 1-4), as well as transcription factor binding sites (see Table 5). The 
polynucleotide fragments may be short polynucleotide fragments of between about 8 and about 40 
nucleotides in length. Such shorter fragments may be useful for diagnostic 
purposes. Such short polynucleotide fragments are also preferred with respect to polynucleotides 
containing or hybridizing to polynucleotides containing splice junctions. Alternatively, larger 
fragments, e.g., of about 50, 150, 500, 600 or about 2000 nucleotides in length may be used. 

Table 1: Exon/Intron Regions of Polymerase, DNA directed, 50kD regulatory subunit (POLD2) 
Genomic DNA 

EXON    LOCATION (nucleotide no.)    (Amino acid no.) 
1.      11546 - 11764                1 - 73 
2.      15534 - 15656                74 - 114 
3.      15857 - 15979                115 - 155 
4.      16351 - 16464                156 - 193 
5.      16582 - 16782                194 - 260 
6.      17089 - 17169                261 - 287 
7.      17327 - 17484                288 - 339 
8.      17704 - 17829                340 - 381 
9.      18199 - 18303                382 - 416 
10.     18653 - 18811                417 - 469 

'tga' at 18812 - 14 
Poly A at 18885 - 90 
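
A table of this kind lends itself to a simple consistency check: each exon's nucleotide span can be compared with three times the number of amino acids assigned to it. The sketch below does this for the Table 1 values as printed; the variable and function names are hypothetical. With these values the check flags exon 7 (158 nucleotides listed against 52 residues), which may reflect a codon split across a splice junction or a transcription error in the coordinates rather than an error in the gene model.

```python
# (exon number, nucleotide start, nucleotide end, first amino acid, last amino acid)
POLD2_EXONS = [
    (1, 11546, 11764, 1, 73),
    (2, 15534, 15656, 74, 114),
    (3, 15857, 15979, 115, 155),
    (4, 16351, 16464, 156, 193),
    (5, 16582, 16782, 194, 260),
    (6, 17089, 17169, 261, 287),
    (7, 17327, 17484, 288, 339),
    (8, 17704, 17829, 340, 381),
    (9, 18199, 18303, 382, 416),
    (10, 18653, 18811, 417, 469),
]


def exon_report(exons):
    """Return (exon, nucleotide length, residue count, consistent?) for each exon."""
    rows = []
    for number, nt_start, nt_end, aa_start, aa_end in exons:
        nt_length = nt_end - nt_start + 1   # coordinates are inclusive
        aa_count = aa_end - aa_start + 1
        rows.append((number, nt_length, aa_count, nt_length == 3 * aa_count))
    return rows


report = exon_report(POLD2_EXONS)
for number, nt_length, aa_count, consistent in report:
    status = "ok" if consistent else "check"
    print(f"exon {number:>2}: {nt_length:>4} nt, {aa_count:>3} aa  {status}")

flagged = [number for number, _, _, consistent in report if not consistent]
print("exons to re-examine:", flagged)
```

The same check can be run on Tables 2 to 4 by listing reverse-strand amino acid ranges with the smaller residue number first.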

Table 2: AEBP1 (adipocyte enhancer binding protein 1), vascular smooth muscle-type. Reverse 
strand coding. 

EXON    LOCATION (nucleotide no.)    (Amino acid no.) 
21.     1301 - 1966                  1158 - 937 
20.     2209 - 2304                  936 - 905 
19.     2426 - 2569                  904 - 857 
18.     2651 - 3001                  856 - 740 
17.     3238 - 3417                  739 - 680 
16.     3509 - 3706                  679 - 614 
15.     3930 - 4052                  613 - 573 
14.     4320 - 4406                  572 - 544 
13.     4503 - 4646                  543 - 496 
12.     4750 - 4833                  495 - 468 
11.     5212 - 5352                  467 - 421 
10.     5435 - 5545                  420 - 384 
9.      6219 - 6272                  383 - 366 
8.      6376 - 6453                  365 - 340 
7.      6584 - 6661                  339 - 314 
6.      7476 - 7553                  313 - 288 
5.      7629 - 7753                  287 - 247 
4.      7860 - 7931                  246 - 223 
3.      8050 - 8121                  222 - 199 
2.      8673 - 9014                  198 - 85 
1.      10642 - 10893                84 - 1 

Stop codon 1298 - 1300 
Poly A-site 1013 - 18 

Table 3: Glucokinase 

EXON    LOCATION (nucleotide no.)    (Amino acid no.) 
1.      20485 - 20523                1 - 13 
2.      25133 - 25297                14 - 68 
3.      26173 - 26328                69 - 120 
4.      27524 - 27643                121 - 160 
5.      28535 - 28630                161 - 192 
6.      28740 - 28838                193 - 225 
7.      30765 - 30950                226 - 287 
8.      31982 - 32134                288 - 338 
9.      32867 - 33097                339 - 415 
10.     33314 - 33460                416 - 464 

Stop codon 33461-3 



Table 4: SNARE YKT6. Reverse strand coding. 

EXON    LOCATION (nucleotide no.)    (Amino acid no.) 
7.      4320 - 4352                  198 - 188 
6.      5475 - 5576                  187 - 154 
5.      8401 - 8466                  153 - 132 
4.      9107 - 9211                  131 - 97 
3.      10114 - 10215                96 - 63 
2.      11950 - 12033                62 - 35 
1.      15362 - 15463                34 - 1 

Stop codon at 4817 - 19 
Poly A-site: 4245 - 4250 

TABLE 5: TRANSCRIPTION FACTOR BINDING SITES 

BINDING SITES    SNARE YKT6    GLUCOKINASE    POLD2    AEBP1 
AP1FJ-Q2         11, 11 [column positions not recoverable from the source] 
AP1-C            15            15             7        6 
AP1-Q2           9, 5 [column positions not recoverable from the source] 
AP1-Q4           7             -              -        4 
AP4-Q5           36            -              5        43 
AP4-Q6           17            -              -        23 
ARNT-01          7             -              -        5 
CEBP-01          7             -              -        - 
CETS1P54-01      6             -              -        - 
CREL-01          7             -              -        - 
DELTAEF1-01      64            12             5        50 
FREAC7-01        -             4              -        - 
GATA1-02         19            -              -        - 
GATA1-03         12            -              -        6 
GATA1-04         25            6              -        - 
GATA1-06         8             5              -        - 
GATA2-02         10            -              -        - 
GATA3-02         5             -              -        - 
GATA-C           11            6              -        - 
GC-01            -             -              -        4 
GFII-01          6             -              -        - 
HFH2-01          5             -              -        - 
HFH3-01          10            -              -        - 
HFH8-01          4             -              -        - 
IK2-01           49            -              -        29 
LMO2COM-01       41            6              -        27 
LMO2COM-02       31            5              -        7 
LYF1-01          10            13             6        - 
MAX-01           4             -              -        - 
MYOD-01          7             -              -        - 
MYOD-Q6          32            19             7        12 
MZF1-01          99            40             15       94 
NF1-Q6           5             -              -        7 
NFAT-Q6          43            8              7        8 
NFKAPPAB50-01    -             4              -        - 
NKX25-01         13            14             5        - 
NMYC-01          12            -              -        8 
S8-01            -             30             4        - 
SOX5-01          21            20             4        4 
SP1-Q6           -             -              -        8 
SAEBP1-01        4             -              -        - 
SRV-02           5             -              -        - 
STAT-01          6             -              -        - 
TATA-01          8             -              -        - 
TCF11-01         47            28             5        19 
USF-01           12            8              6        8 
USF-C            16            12             12       8 
USF-Q6           6             -              -        - 

In a specific embodiment, such noncoding sequences are expression control sequences. These 
include but are not limited to DNA regulatory sequences, such as promoters, enhancers, repressors, 
terminators, and the like, that provide for the regulation of expression of a coding sequence in a host 
cell. In eukaryotic cells, polyadenylation signals are also control sequences. 

In a more specific embodiment of the invention, the expression control sequences may be 
operatively linked to a polynucleotide encoding a heterologous polypeptide. Such expression control 
sequences may be about 50-200 nucleotides in length and specifically about 50, 100, 200, 500, 600, 
1000 or 2000 nucleotides in length. A transcriptional control sequence is "operatively linked" to a 
polynucleotide encoding a heterologous polypeptide sequence when the expression control sequence 
controls and regulates the transcription and translation of that polynucleotide sequence. The term 
"operatively linked" includes having an appropriate start signal (e.g., ATG) in front of the 
polynucleotide sequence to be expressed and maintaining the correct reading frame to permit 
expression of the DNA sequence under the control of the expression control sequence and production 
of the desired product encoded by the polynucleotide sequence. If a gene that one desires to insert into 
a recombinant DNA molecule does not contain an appropriate start signal, such a start signal can be 
inserted upstream (5') of and in reading frame with the gene. 
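
The start-signal requirement described above can be illustrated with a minimal Python sketch. The function name `ensure_start_codon` and the example sequences are hypothetical and are not part of the disclosure; the sketch assumes that the supplied coding sequence is already in frame from its first nucleotide.

```python
def ensure_start_codon(coding_seq: str) -> str:
    """Return the coding sequence with an in-frame ATG at its 5' end.

    Assumption (illustration only): `coding_seq` is read in frame
    starting at its first nucleotide.
    """
    seq = coding_seq.upper()
    # A length that is not a multiple of three means the frame cannot be
    # maintained through the end of the sequence.
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    if seq.startswith("ATG"):
        return seq  # an appropriate start signal is already present
    # Insert the start signal upstream (5') of, and in frame with, the gene.
    return "ATG" + seq
```

Because exactly three nucleotides are added, the downstream codons keep their original reading frame.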

Expression of Polypeptides 

Isolated Polynucleotide Sequences 

The human chromosome 7 genomic clone of accession number AC006454 has been 
discovered to contain the SNARE YKT6 gene, the human liver glucokinase gene, the AEBP1 gene, and 
the POLD2 gene by Genscan analysis (Burge et al., 1997, J. Mol. Biol. 268:78-94) and BLAST2 and 
TBLASTN analysis (Altschul et al., 1997, Nucl. Acids Res. 25:3389-3402), in which the sequence of 
AC006454 was compared to the SNARE YKT6 cDNA sequence, accession number NM_006555 
(McNew et al., 1997, J. Biol. Chem. 272:17776-17783), the human liver glucokinase cDNA sequence 
(Tanizawa et al., 1992, Mol. Endocrinol. 6:1070-1081), accession numbers NM_000162 (major form) 
and M69051 (minor form), the AEBP1 cDNA sequence, accession number NM_001129 (accession 
number D86479 for the osteoblast type) (Layne et al., 1998, J. Biol. Chem. 273:15654-15660) and the 
POLD2 cDNA sequence, accession number NM_006230 (Zhang et al., 1995, Genomics 29:179-186). 

The cloning of the nucleic acid sequences of the present invention from such genomic DNA 
can be effected, e.g., by using the well known polymerase chain reaction (PCR) or antibody screening 
of expression libraries to detect cloned DNA fragments with shared structural features. See, e.g., Innis 
et al., 1990, PCR: A Guide to Methods and Application, Academic Press, New York. Other nucleic 
acid amplification procedures such as ligase chain reaction (LCR), ligated activated transcription (LAT) 
and nucleic acid sequence-based amplification (NASBA) or long chain PCR may be used. In a 
specific embodiment, 5' or 3' non-coding portions of each gene may be identified by methods 
including, but not limited to, filter probing, clone enrichment using specific probes, and protocols 
similar or identical to 5' and 3' "RACE" protocols, which are well known in the art. For instance, a 
method similar to 5' RACE is available for generating the missing 5' end of a desired full-length 
transcript (Fromont-Racine et al., 1993, Nucl. Acids Res. 21:1683-1684). 

Once the DNA fragments are generated, identification of the specific DNA fragment 
containing the desired SNARE YKT6 gene, the human liver glucokinase gene, the AEBP1 gene, or 
POLD2 gene may be accomplished in a number of ways. For example, if an amount of a portion of the 
SNARE YKT6 gene, the human liver glucokinase gene, the AEBP1 gene, or the POLD2 
gene, or its specific RNA, or a fragment thereof, is available and can be purified and labeled, the 
generated DNA fragments may be screened by nucleic acid hybridization to the labeled probe (Benton 
and Davis, 1977, Science 196:180; Grunstein and Hogness, 1975, Proc. Natl. Acad. Sci. U.S.A. 
72:3961). The present invention provides such nucleic acid probes, which can be conveniently 
prepared from the specific sequences disclosed herein, e.g., a hybridizable probe having a nucleotide 
sequence corresponding to at least a 10, and preferably a 15, nucleotide fragment of the sequences 
depicted in SEQ ID NOS:5, 6, 7 or 8. Preferably, a fragment is selected that is unique to the 
encoded polypeptides. Those DNA fragments with substantial homology to the probe will hybridize. 
As noted above, the greater the degree of homology, the more stringent the hybridization conditions 
that can be used. In one embodiment, low stringency hybridization conditions are used to identify a 
homologous SNARE YKT6, human liver glucokinase, AEBP1, or POLD2 polynucleotide. 
However, in a preferred aspect, and as demonstrated experimentally herein, a nucleic acid encoding a 
polypeptide of the invention will hybridize to a nucleic acid derived from the polynucleotide sequence 
depicted in SEQ ID NOS:5, 6, 7 or 8 or a hybridizable fragment thereof, under moderately stringent 
conditions; more preferably, it will hybridize under high stringency conditions. 
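
As an illustration of selecting a probe fragment of at least 10, and preferably 15, nucleotides, the following Python sketch lists fragments that occur only once in a target sequence, counting both strands. The function names and the uniqueness criterion are assumptions made for this example; the specification does not prescribe a particular selection procedure, and a practical probe design would also consider melting temperature and genome-wide specificity.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def candidate_probes(target: str, length: int = 15) -> list:
    """List (1-based start, fragment) pairs for fragments unique in `target`.

    A fragment is treated as unique when it occurs exactly once among all
    windows of the target and their reverse complements (an illustrative
    criterion, not one taken from the specification).
    """
    if length < 10:
        raise ValueError("probe fragments should be at least 10 nucleotides")
    target = target.upper()
    windows = [target[i:i + length] for i in range(len(target) - length + 1)]
    counts = Counter(windows)
    counts.update(reverse_complement(w) for w in windows)
    return [(i + 1, w) for i, w in enumerate(windows) if counts[w] == 1]
```

In use, `target` would be the relevant region of SEQ ID NOS:5, 6, 7 or 8, and the returned start positions are 1-based to match the nucleotide numbering used in the text.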

Alternatively, the presence of the gene may be detected by assays based on the physical, 
chemical, or immunological properties of its expressed product. For example, cDNA clones, or DNA 
clones which hybrid-select the proper mRNAs, can be selected which produce a protein that, e.g., has 
similar or identical electrophoretic migration, isoelectric focusing behavior, proteolytic digestion maps, 
or antigenic properties as known for the SNARE YKT6, human liver glucokinase, AEBP1, or 
POLD2 polypeptide. 

A gene encoding the SNARE YKT6, human liver glucokinase, AEBP1, or POLD2 
polypeptide can also be identified by mRNA selection, i.e., by nucleic acid hybridization followed by 
in vitro translation. In this procedure, fragments are used to isolate complementary mRNAs by 
hybridization. Immunoprecipitation analysis or functional assays of the in vitro translation products of 
the isolated mRNAs identify the mRNA and, therefore, the complementary DNA 
fragments, that contain the desired sequences. 

Nucleic Acid Constructs 

The present invention also relates to nucleic acid constructs comprising a polynucleotide 
sequence containing the exon/intron segments of the SNARE YKT6 gene (nucleotides 4320-15463 of 
SEQ ID NO:5), human liver glucokinase gene (nucleotides 20485-33460 of SEQ ID NO:6), AEBP1 
gene (nucleotides 1301-13893 of SEQ ID NO:8) or POLD2 gene (nucleotides 11546-18811 of SEQ 
ID NO:7) operably linked to one or more control sequences which direct the expression of the coding 
sequence in a suitable host cell under conditions compatible with the control sequences. Expression 
will be understood to include any step involved in the production of the polypeptide including, but not 
limited to, transcription, post-transcriptional modification, translation, post-translational modification, 
and secretion. 
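
The nucleotide ranges recited above are 1-based and inclusive. A minimal Python sketch of how such a range maps onto a sequence string follows; the dictionary and function names are hypothetical, and the sequences themselves (SEQ ID NOS:5 to 8) are assumed to be supplied by the reader.

```python
# Exon/intron segment coordinates as recited in the text:
# gene -> (sequence identifier, first nucleotide, last nucleotide)
GENE_SEGMENTS = {
    "SNARE YKT6": ("SEQ ID NO:5", 4320, 15463),
    "glucokinase": ("SEQ ID NO:6", 20485, 33460),
    "AEBP1": ("SEQ ID NO:8", 1301, 13893),
    "POLD2": ("SEQ ID NO:7", 11546, 18811),
}


def segment_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive nucleotide range."""
    return end - start + 1


def extract_segment(sequence: str, start: int, end: int) -> str:
    """Return nucleotides start..end (1-based, inclusive) of `sequence`."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]
```

For example, `segment_length(4320, 15463)` gives the size of the SNARE YKT6 segment, and `extract_segment(seq5, 4320, 15463)` would return it from a string `seq5` holding SEQ ID NO:5.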

The invention is further directed to a nucleic acid construct comprising expression control 
sequences derived from SEQ ID NOS: 5, 6, 7 or 8 and a heterologous polynucleotide sequence. 

"Nucleic acid construct" is defined herein as a nucleic acid molecule, either single- or double- 
stranded, which is isolated from a naturally occurring gene or which has been modified to contain 
segments of nucleic acid which are combined and juxtaposed in a manner which would not otherwise 
exist in nature. The term nucleic acid construct is synonymous with the term expression cassette when 
the nucleic acid construct contains all the control sequences required for expression of a coding 
sequence of the present invention. The term "coding sequence" is defined herein as a portion of a 
nucleic acid sequence which directly specifies the amino acid sequence of its protein product. The 
boundaries of the coding sequence are generally determined by a ribosome binding site (prokaryotes) 
or by the ATG start codon (eukaryotes) located just upstream of the open reading frame at the 5' end 
of the mRNA and a transcription terminator sequence located just downstream of the open reading 
frame at the 3' end of the mRNA. A coding sequence can include, but is not limited to, DNA, cDNA, 
and recombinant nucleic acid sequences. 

The isolated polynucleotide of the present invention may be manipulated in a variety of ways 
to provide for expression of the polypeptide. Manipulation of the nucleic acid sequence prior to its 
insertion into a vector may be desirable or necessary depending on the expression vector. The 
techniques for modifying nucleic acid sequences utilizing recombinant DNA methods are well known 
in the art. 

The control sequence may be an appropriate promoter sequence, a nucleic acid sequence 
which is recognized by a host cell for expression of the nucleic acid sequence. The promoter sequence 
contains transcriptional control sequences which regulate the expression of the polynucleotide. The 
promoter may be any nucleic acid sequence which shows transcriptional activity in the host cell of 
choice including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding 
extracellular or intracellular polypeptides either homologous or heterologous to the host cell. 

Examples of suitable promoters for directing the transcription of the nucleic acid constructs of 
the present invention, especially in a bacterial host cell, are the promoters obtained from the E. coli lac 
operon, the prokaryotic beta-lactamase gene (Villa-Komaroff et al., 1978, Proc. Natl. Acad. Sci. USA 
75: 3727-3731), as well as the tac promoter (DeBoer et al., 1983, Proc. Natl. Acad. Sci. USA 
80: 21-25). Further promoters are described in "Useful proteins from recombinant bacteria" in 
Scientific American, 1980, 242: 74-94; and in Sambrook et al., 1989, supra. 

Examples of suitable promoters for directing the transcription of the nucleic acid constructs of 
the present invention in a filamentous fungal host cell are promoters obtained from the genes encoding 
Aspergillus oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase, Aspergillus niger neutral 
alpha-amylase, Aspergillus niger acid stable alpha-amylase, Aspergillus niger or Aspergillus awamori 
glucoamylase (glaA), Rhizomucor miehei lipase, Aspergillus oryzae alkaline protease, Aspergillus 
oryzae triose phosphate isomerase, Aspergillus nidulans acetamidase, Fusarium oxysporum trypsin-like 
protease (WO 96/00787), NA2-tpi (a hybrid of the promoters from the genes encoding Aspergillus 
niger neutral alpha-amylase and Aspergillus oryzae triose phosphate isomerase), and mutant, truncated, 
and hybrid promoters thereof. 

In a yeast host, useful promoters are obtained from the Saccharomyces cerevisiae enolase 
(ENO-1) gene, the Saccharomyces cerevisiae galactokinase gene (GAL1), the Saccharomyces 
cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase genes (ADH2/GAP), 
and the Saccharomyces cerevisiae 3-phosphoglycerate kinase gene. Other useful promoters for yeast 
host cells are described by Romanos et al., 1992, Yeast 8: 423-488. 

Eukaryotic promoters may be obtained from the genomes of viruses such as polyoma virus, 
fowlpox virus, adenovirus, bovine papilloma virus, avian sarcoma virus, cytomegalovirus, a retrovirus, 
hepatitis-B virus and SV40. Alternatively, heterologous mammalian promoters, such as the actin 
promoter or immunoglobulin promoter may be used. 

The constructs of the invention may also include enhancers. Enhancers are cis-acting elements 
of DNA, usually from about 10 to about 300 bp, that act on a promoter to increase transcription. 
Enhancers from the globin, elastase, albumin, alpha-fetoprotein, and insulin genes may be used. 
Alternatively, an enhancer from a virus may be used; examples include the SV40 enhancer on the late 
side of the replication origin, the cytomegalovirus early promoter enhancer, the polyoma enhancer on 
the late side of the replication origin and adenovirus enhancers. 

The control sequence may also be a suitable transcription terminator sequence, a sequence 
recognized by a host cell to terminate transcription. The terminator sequence is operably linked to the 
3' terminus of the nucleic acid sequence encoding the polypeptide. Any terminator which is 
functional in the host cell of choice may be used in the present invention. 

The control sequence may also be a suitable leader sequence, a nontranslated region of an 
mRNA which is important for translation by the host cell. The leader sequence is operably linked to 
the 5' terminus of the nucleic acid sequence encoding the polypeptide. Any leader sequence that is 
functional in the host cell of choice may be used in the present invention. 

The control sequence may also be a polyadenylation sequence, a sequence which is operably 
linked to the 3' terminus of the nucleic acid sequence and which, when transcribed, is recognized by 
the host cell as a signal to add polyadenosine residues to transcribed mRNA. Any polyadenylation 
sequence which is functional in the host cell of choice may be used in the present invention. 

The control sequence may also be a signal peptide coding region, which codes for an amino 
acid sequence linked to the amino terminus of the polypeptide which can direct the encoded 
polypeptide into the cell's secretory pathway. The 5' end of the coding sequence of the nucleic acid 
sequence may inherently contain a signal peptide coding region naturally linked in translation reading 
frame with the segment of the coding region which encodes the secreted polypeptide. Alternatively, 
the 5' end of the coding sequence may contain a signal peptide coding region which is foreign to the 
coding sequence. The foreign signal peptide coding region may be required where the coding 
sequence does not normally contain a signal peptide coding region. Alternatively, the foreign signal 
peptide coding region may simply replace the natural signal peptide coding region in order to obtain 
enhanced secretion of the polypeptide. However, any signal peptide coding region which directs the 
expressed polypeptide into the secretory pathway of a host cell of choice may be used in the present 
invention. 

The control sequence may also be a propeptide coding region, which codes for an amino acid 
sequence positioned at the amino terminus of a polypeptide. The resultant polypeptide is known as a 
proenzyme or propolypeptide (or a zymogen in some cases). A propolypeptide is generally inactive 
and can be converted to a mature active polypeptide by catalytic or autocatalytic cleavage of the 
propeptide from the propolypeptide. The propeptide coding region may be obtained from the 
Bacillus subtilis alkaline protease gene (aprE), the Bacillus subtilis neutral protease gene (nprT), the 
Saccharomyces cerevisiae alpha-factor gene, the Rhizomucor miehei aspartic proteinase gene, or the 
Myceliophthora thermophila laccase gene (WO 95/33836). 

Where both signal peptide and propeptide regions are present at the amino terminus of a 
polypeptide, the propeptide region is positioned next to the amino terminus of a polypeptide and the 
signal peptide region is positioned next to the amino terminus of the propeptide region. 
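
The ordering just described (signal peptide region, then propeptide region, then the polypeptide) can be sketched in Python as a simple in-frame concatenation of coding regions. The function name and the toy sequences in the usage note are hypothetical and serve only to show the 5' to 3' order.

```python
def assemble_coding_region(mature: str, signal_peptide: str = "",
                           propeptide: str = "") -> str:
    """Join coding regions 5' to 3' as signal peptide, propeptide, polypeptide.

    Each part must be a whole number of codons so that the translation
    reading frame is preserved across the junctions.
    """
    parts = [signal_peptide, propeptide, mature]
    for part in parts:
        if len(part) % 3 != 0:
            raise ValueError("each coding region must be a multiple of three nucleotides")
    return "".join(part.upper() for part in parts)
```

For instance, `assemble_coding_region("GCTTAA", signal_peptide="ATGAAA", propeptide="CCC")` places the signal peptide codons first, the propeptide codon next, and the polypeptide codons last.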

It may also be desirable to add regulatory sequences which allow the regulation of the 
expression of the polypeptide relative to the growth of the host cell. Examples of regulatory systems 
are those which cause the expression of the gene to be turned on or off in response to a chemical or 
physical stimulus, including the presence of a regulatory compound. Regulatory systems in 
prokaryotic systems would include the lac, tac, and trp operator systems. In yeast, the ADH2 system or 
GAL1 system may be used. In filamentous fungi, the TAKA alpha-amylase promoter, Aspergillus 
niger glucoamylase promoter, and the Aspergillus oryzae glucoamylase promoter may be used as 
regulatory sequences. Other examples of regulatory sequences are those which allow for gene 
amplification. In eukaryotic systems, these include the dihydrofolate reductase gene which is 
amplified in the presence of methotrexate, and the metallothionein genes which are amplified with 
heavy metals. In these cases, the nucleic acid sequence encoding the polypeptide would be operably 
linked with the regulatory sequence. 

Expression Vectors 

The present invention also relates to recombinant expression vectors comprising a nucleic acid 
sequence of the present invention, a promoter, and transcriptional and translational stop signals. The 
various nucleic acid and control sequences described above may be joined together to produce a 
recombinant expression vector which may include one or more convenient restriction sites to allow for 
insertion or substitution of the nucleic acid sequence encoding the polypeptide at such sites. 
Alternatively, the polynucleotide of the present invention may be expressed by inserting the nucleic 
acid sequence or a nucleic acid construct comprising the sequence into an appropriate vector for 
expression. In creating the expression vector, the coding sequence is located in the vector so that the 
coding sequence is operably linked with the appropriate control sequences for expression. 

The recombinant expression vector may be any vector (e.g., a plasmid or virus) which can be 
conveniently subjected to recombinant DNA procedures and can bring about the expression of the 
nucleic acid sequence. The choice of the vector will typically depend on the compatibility of the 
vector with the host cell into which the vector is to be introduced. The vectors may be linear or closed 
circular plasmids. 

The vector may be an autonomously replicating vector, i.e., a vector which exists as an 
extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a 
plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome. The vector 
may contain any means for assuring self-replication. Alternatively, the vector may be one which, when 
introduced into the host cell, is integrated into the genome and replicated together with the 
chromosome(s) into which it has been integrated. Furthermore, a single vector or plasmid or two or 
more vectors or plasmids which together contain the total DNA to be introduced into the genome of 
the host cell, or a transposon may be used. 

The vectors of the present invention preferably contain one or more selectable markers which 
permit easy selection of transformed cells. A selectable marker is a gene the product of which provides 
for biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like. 
Examples of bacterial selectable markers are the dal genes from Bacillus subtilis or Bacillus 
licheniformis, or markers which confer antibiotic resistance such as ampicillin, kanamycin, 
chloramphenicol or tetracycline resistance. Suitable markers for yeast host cells are ADE2, HIS3, 
LEU2, LYS2, MET3, TRP1, and URA3. Examples of suitable selectable markers for mammalian 
cells are those that enable the identification of cells competent to take up the nucleic acids of the 
present invention, such as DHFR or thymidine kinase. An appropriate host cell when wild-type DHFR 
is employed is the CHO cell line deficient in DHFR activity, prepared and propagated as described by 
Urlaub et al., Proc. Natl. Acad. Sci. USA, 77:4216 (1980). 

The vectors of the present invention preferably contain an element(s) that permits stable 
integration of the vector into the host cell genome or autonomous replication of the vector in the cell 
independent of the genome of the cell. 

For integration into the host cell genome, the vector may rely on the polynucleotide sequence 
encoding the polypeptide or any other element of the vector for stable integration of the vector into 
the genome by homologous or nonhomologous recombination. Alternatively, the vector may contain 
additional nucleic acid sequences for directing integration by homologous recombination into the 
genome of the host cell. The additional polynucleotide sequences enable the vector to be integrated 
into the host cell genome at a precise location(s) in the chromosome(s). To increase the likelihood of 
integration at a precise location, the integrational elements should preferably contain a sufficient 
number of nucleic acids, such as 100 to 1,500 base pairs, preferably 400 to 1,500 base pairs, and most 
preferably 800 to 1,500 base pairs, which are highly homologous with the corresponding target 
sequence to enhance the probability of homologous recombination. The integrational elements may 
be any sequence that is homologous with the target sequence in the genome of the host cell. 
Furthermore, the integrational elements may be non-encoding or encoding nucleic acid sequences. On 
the other hand, the vector may be integrated into the genome of the host cell by non-homologous 
recombination. 
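
The length ranges stated above for integrational elements can be expressed as a small Python helper. The function name and the returned labels are hypothetical; only the base-pair ranges come from the text.

```python
def classify_integration_element(length_bp: int) -> str:
    """Classify an integrational element by length, using the stated ranges."""
    if 800 <= length_bp <= 1500:
        return "most preferred"   # 800 to 1,500 base pairs
    if 400 <= length_bp <= 1500:
        return "preferred"        # 400 to 1,500 base pairs
    if 100 <= length_bp <= 1500:
        return "suitable"         # 100 to 1,500 base pairs
    return "outside stated range"
```

The checks run from the narrowest range to the widest, so each length receives the most specific label that applies.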

For autonomous replication, the vector may further comprise an origin of replication enabling 
the vector to replicate autonomously in the host cell in question. Examples of bacterial origins of 
replication are the origins of replication of plasmids pBR322, pUC19, pACYC177, and pACYC184 
permitting replication in E. coli, and pUB110, pE194, pTA1060, and pAMβ1 permitting replication in 
Bacillus. Examples of origins of replication for use in a yeast host cell are the 2 micron origin of 
replication, ARS1, ARS4, the combination of ARS1 and CEN3, and the combination of ARS4 and 
CEN6. The origin of replication may be one having a mutation which makes its functioning 
temperature-sensitive in the host cell (see, e.g., Ehrlich, 1978, Proceedings of the National Academy of 
Sciences USA 75: 1433). 

More than one copy of a polynucleotide sequence of the present invention may be inserted 
into the host cell to increase production of the gene product. An increase in the copy number of the 
polynucleotide sequence can be obtained by integrating at least one additional copy of the sequence 
into the host cell genome or by including an amplifiable selectable marker gene with the nucleic acid 
sequence where cells containing amplified copies of the selectable marker gene, and thereby additional 
copies of the nucleic acid sequence, can be selected for by cultivating the cells in the presence of the 
appropriate selectable agent. 

The procedures used to ligate the elements described above to construct the recombinant 
expression vectors of the present invention are well known to one skilled in the art (see, e.g., Sambrook 
et al., 1989, supra). 

Host Cells 

The present invention also relates to recombinant host cells, comprising a nucleic acid sequence 
of the invention, which are advantageously used in the recombinant production of the polypeptides. A 
vector comprising a nucleic acid sequence of the present invention is introduced into a host cell so that 
the vector is maintained as a chromosomal integrant or as a self-replicating extra-chromosomal vector 
as described earlier. The term "host cell" encompasses any progeny of a parent cell that is not identical 
to the parent cell due to mutations that occur during replication. The choice of a host cell will to a 
large extent depend upon the gene encoding the polypeptide and its source. 

The host cell may be a unicellular microorganism, e.g., a prokaryote, or a non-unicellular 
microorganism, e.g., a eukaryote. Useful unicellular cells are bacterial cells such as gram positive 
bacteria including, but not limited to, a Bacillus cell, or a Streptomyces cell, e.g., Streptomyces lividans 
or Streptomyces murinus, or gram negative bacteria such as E. coli and Pseudomonas sp. 

The introduction of a vector into a bacterial host cell may, for instance, be effected by 
protoplast transformation (see, e.g., Chang and Cohen, 1979, Molecular General Genetics 168: 111-115), 
using competent cells (see, e.g., Young and Spizizen, 1961, Journal of Bacteriology 81: 823-829, 
or Dubnau and Davidoff-Abelson, 1971, Journal of Molecular Biology 56: 209-221), electroporation 
(see, e.g., Shigekawa and Dower, 1988, Biotechniques 6: 742-751), or conjugation (see, e.g., Koehler 
and Thorne, 1987, Journal of Bacteriology 169: 5271-5278). 

The host cell may be a eukaryote, such as a mammalian cell (e.g., human cell), an insect cell, a 
plant cell or a fungal cell. Mammalian host cells that could be used include but are not limited to 
human HeLa, embryonic kidney cells (293), lung cells, H9 and Jurkat cells, mouse NIH3T3 and C127 
cells, Cos 1, Cos 7 and CV1, quail QC1-3 cells, mouse L cells and Chinese hamster ovary (CHO) cells. 
These cells may be transfected with a vector containing a transcriptional regulatory sequence, a protein 
coding sequence and transcriptional termination sequences. Alternatively, the polypeptide can be 
expressed in stable cell lines containing the polynucleotide integrated into a chromosome. Co-transfection 
with a selectable marker such as dhfr, gpt, neomycin, or hygromycin allows the identification 
and isolation of the transfected cells. 

The host cell may be a fungal cell. "Fungi" as used herein includes the phyla Ascomycota, 
Basidiomycota, Chytridiomycota, and Zygomycota (as defined by Hawksworth et al., In, Ainsworth 
and Bisby's Dictionary of The Fungi, 8th edition, 1995, CAB International, University Press, 
Cambridge, UK) as well as the Oomycota (as cited in Hawksworth et al., 1995, supra, page 171) and all 
mitosporic fungi (Hawksworth et al., 1995, supra). The fungal host cell may also be a yeast cell. 
"Yeast" as used herein includes ascosporogenous yeast (Endomycetales), basidiosporogenous yeast, 
and yeast belonging to the Fungi Imperfecti (Blastomycetes). Since the classification of yeast may 
change in the future, for the purposes of this invention, yeast shall be defined as described in Biology 
and Activities of Yeast (Skinner, F.A., Passmore, S.M., and Davenport, R.R., eds, Soc. App. Bacteriol. 
Symposium Series No. 9, 1980). The fungal host cell may also be a filamentous fungal cell. 
"Filamentous fungi" include all filamentous forms of the subdivision Eumycota and Oomycota (as 
defined by Hawksworth et al., 1995, supra). The filamentous fungi are characterized by a mycelial 
wall composed of chitin, cellulose, glucan, chitosan, mannan, and other complex polysaccharides. 
Vegetative growth is by hyphal elongation and carbon catabolism is obligately aerobic. In contrast, 
vegetative growth by yeasts such as Saccharomyces cerevisiae is by budding of a unicellular thallus 
and carbon catabolism may be fermentative. 

Fungal cells may be transformed by a process involving protoplast formation, transformation 
of the protoplasts, and regeneration of the cell wall in a manner known per se. Suitable procedures for 
transformation of Aspergillus host cells are described in EP 238 023 and Yelton et al., 1984, 
Proceedings of the National Academy of Sciences USA 81: 1470-1474. Suitable methods for 
transforming Fusarium species are described by Malardier et al., 1989, Gene 78: 147-156 and WO 
96/00787. Yeast may be transformed using the procedures described by Becker and Guarente, In 
Abelson, J.N. and Simon, M.I., editors, Guide to Yeast Genetics and Molecular Biology, Methods in 
Enzymology, Volume 194, pp 182-187, Academic Press, Inc., New York; Ito et al., 1983, Journal of 
Bacteriology 153: 163; and Hinnen et al., 1978, Proc. Natl. Acad. Sci. USA 75: 1920. 

Methods of Production 

The present invention also relates to methods for producing a polypeptide of the present 
invention comprising (a) cultivating a host cell under conditions conducive for production of the 
polypeptide; and (b) recovering the polypeptide. 

In the production methods of the present invention, the cells are cultivated in a nutrient 
medium suitable for production of the polypeptide using methods known in the art. For example, the 
cell may be cultivated by shake flask cultivation, small-scale or large-scale fermentation (including 
continuous, batch, fed-batch, or solid state fermentations) in laboratory or industrial fermentors 
performed in a suitable medium and under conditions allowing the polypeptide to be expressed and/or 
isolated. The cultivation takes place in a suitable nutrient medium comprising carbon and nitrogen 
sources and inorganic salts, using procedures known in the art. Suitable media are available from 
commercial suppliers or may be prepared according to published compositions (e.g., in catalogues of 
the American Type Culture Collection). If the polypeptide is secreted into the nutrient medium, the 
polypeptide can be recovered directly from the medium. If the polypeptide is not secreted, it can be 
recovered from cell lysates. 

The polypeptides may be detected using methods known in the art that are specific for the 
polypeptides. These detection methods may include use of specific antibodies, formation of an 
enzyme product, or disappearance of an enzyme substrate. In a specific embodiment, an enzyme assay 
may be used to determine the activity of the polypeptide. For example, AEBP1 activity can be 
determined by measuring carboxypeptidase activity as described by Muise and Ro, 1999, Biochem. J. 
343:341-345. Here, the conversion of hippuryl-L-arginine, hippuryl-L-lysine or hippuryl-L-phenylalanine 
to hippuric acid may be monitored spectrophotometrically. POLD2 activity may be 
detected by assaying for DNA polymerase delta activity (see, for example, Ng et al., 1991, J. Biol. Chem. 
266:11699-11704). 

The resulting polypeptide may be recovered by methods known in the art. For example, the 
polypeptide may be recovered from the nutrient medium by conventional procedures including, but 
not limited to, centrifugation, filtration, extraction, spray-drying, evaporation, or precipitation. 

The polypeptides of the present invention may be purified by a variety of procedures known in 
the art including, but not limited to, chromatography (e.g., ion exchange, affinity, hydrophobic, 
chromatofocusing, and size exclusion), electrophoretic procedures (e.g., preparative isoelectric 
focusing), differential solubility (e.g., ammonium sulfate precipitation), SDS-PAGE, or extraction (see, 
e.g., Protein Purification, J.-C. Janson and Lars Ryden, editors, VCH Publishers, New York, 1989). 

Antibodies 

According to the invention, the SNARE YKT6, human glucokinase, AEBP1 or POLD2 
polypeptides produced according to the method of the present invention may be used as an 
immunogen to generate antibodies to any of these polypeptides. Such antibodies include but are not 
limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments, and an Fab expression library. 

Various procedures known in the art may be used for the production of antibodies. For the 
production of antibody, various host animals can be immunized by injection with the polypeptide 
or a fragment thereof, including but not limited to rabbits, mice, rats, sheep, goats, etc. In one embodiment, the 
polypeptide or fragment thereof can optionally be conjugated to an immunogenic carrier, e.g., bovine 
serum albumin (BSA) or keyhole limpet hemocyanin (KLH). Various adjuvants may be used to 
increase the immunological response, depending on the host species, including but not limited to 
Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active 
substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet 
hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette- 
Guerin) and Corynebacterium parvum. 

For preparation of monoclonal antibodies directed toward the SNARE YKT6, human 
glucokinase, AEBP1 or POLD2 polypeptide, any technique that provides for the production of 
antibody molecules by continuous cell lines in culture may be used. These include but are not limited 
to the hybridoma technique originally developed by Kohler and Milstein (1975, Nature 256:495-497), 
as well as the trioma technique, the human B-cell hybridoma technique (Kozbor et al., 1983, 
Immunology Today 4:72), and the EBV-hybridoma technique to produce human monoclonal 
antibodies (Cole et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77- 
96). In an additional embodiment of the invention, monoclonal antibodies can be produced in germ- 
free animals utilizing recent technology (PCT/US90/02545). According to the invention, human 
antibodies may be used and can be obtained by using human hybridomas (Cote et al., 1983, Proc. 
Natl. Acad. Sci. U.S.A. 80:2026-2030) or by transforming human B cells with EBV virus in vitro (Cole 
et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, pp. 77-96). In fact, 
according to the invention, techniques developed for the production of "chimeric antibodies"
(Morrison et al., 1984, J. Bacteriol. 159:870; Neuberger et al., 1984, Nature 312:604-608; Takeda et
al., 1985, Nature 314:452-454) by splicing the genes from a mouse antibody molecule specific for the 
SNARE YKT6, human glucokinase, AEBP1 or POLD2 polypeptide together with genes from a human 
antibody molecule of appropriate biological activity can be used; such antibodies are within the scope 
of this invention. 

According to the invention, techniques described for the production of single chain antibodies 
(U.S. Pat. No. 4,946,778) can be adapted to produce polypeptide-specific single chain antibodies. An 
additional embodiment of the invention utilizes the techniques described for the construction of Fab 
expression libraries (Huse et al., 1989, Science 246:1275-1281) to allow rapid and easy identification 
of monoclonal Fab fragments with the desired specificity for the SNARE YKT6, AEBP1, human 
glucokinase or POLD2 polypeptides. 

Antibody fragments which contain the idiotype of the antibody molecule can be generated by 
known techniques. For example, such fragments include but are not limited to: the F(ab')2 fragment
which can be produced by pepsin digestion of the antibody molecule; the Fab' fragments which can be
generated by reducing the disulfide bridges of the F(ab')2 fragment; and the Fab fragments which can
be generated by treating the antibody molecule with papain and a reducing agent.

In the production of antibodies, screening for the desired antibody can be accomplished by 
techniques known in the art, e.g., radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), 
"sandwich" immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, 
immunodiffusion assays, in situ immunoassays (using colloidal gold, enzyme or radioisotope labels, 
for example), western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, 
hemagglutination assays), complement fixation assays, immunofluorescence assays, protein A assays, 
and immunoelectrophoresis assays, etc. In one embodiment, antibody binding is detected by detecting 
a label on the primary antibody. In another embodiment, the primary antibody is detected by detecting 
binding of a secondary antibody or reagent to the primary antibody. In a further embodiment, the 
secondary antibody is labeled. Many means are known in the art for detecting binding in an 
immunoassay and are within the scope of the present invention. For example, to select antibodies which 
recognize a specific epitope of a particular polypeptide, one may assay generated hybridomas for a 
product which binds to a particular polypeptide fragment containing such epitope. For selection of an 
antibody specific to a particular polypeptide from a particular species of animal, one can select on the 
basis of positive binding with the polypeptide expressed by or isolated from cells of that species of 
animal. 

Immortal, antibody-producing cell lines can also be created by techniques other than fusion, 
such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr 
virus. See, e.g., M. Schreier et al., "Hybridoma Techniques" (1980); Hammerling et al., "Monoclonal 
Antibodies And T-cell Hybridomas" (1981); Kennett et al., "Monoclonal Antibodies" (1980); see also
U.S. Pat. Nos. 4,341,761; 4,399,121; 4,427,783; 4,444,887; 4,451,570; 4,466,917; 4,472,500;
4,491,632; 4,493,890. 

Uses of Polynucleotides 

Diagnostics 

Polynucleotides containing noncoding regions of SEQ ID NOS:5, 6, 7 or 8 may be used as 
probes for detecting mutations in samples from a patient. Genomic DNA may be isolated from the
patient. A mutation(s) may be detected by Southern blot analysis, specifically by subjecting
restriction-digested genomic DNA to agarose gel electrophoresis and hybridizing it to various probes.

Polynucleotides containing noncoding regions may be used as PCR primers and may be used 
to amplify the genomic DNA isolated from the patients. Additionally, primers may be obtained by 
routine or long range PCR, that can yield products containing more than one exon and intervening 
intron. The sequence of the amplified genomic DNA from the patient may be determined using 
methods known in the art. Such probes may be between 10-100 nucleotides in length and may 
preferably be between 20-50 nucleotides in length. 
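
As a rough illustration of the length ranges given above, the following Python sketch enumerates candidate probe sequences from a noncoding region. The function names, the default length and the example sequence are hypothetical and are not taken from SEQ ID NOS:5, 6, 7 or 8; only the 10-100 and 20-50 nucleotide ranges come from the text.

```python
def candidate_probes(region, length=25):
    """Return every window of `length` nucleotides from a noncoding region.

    Only the 10-100 nt overall range is taken from the text; the function
    name and the default length are illustrative assumptions.
    """
    if not 10 <= length <= 100:
        raise ValueError("probe length must be between 10 and 100 nucleotides")
    region = region.upper()
    return [region[i:i + length] for i in range(len(region) - length + 1)]


def is_preferred_length(probe):
    """True when the probe falls in the preferred 20-50 nucleotide range."""
    return 20 <= len(probe) <= 50


# Hypothetical 30-nt intronic stretch, used purely as an example input.
example_region = "GTAAGTCCTGACCTTAGGCATTCAGGCTAG"
probes = candidate_probes(example_region, length=25)
```

In practice, candidate windows would also be screened for specificity and melting temperature, which this sketch does not attempt.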

The invention is thus directed to kits comprising these polynucleotide probes. In a
specific embodiment, these probes are labeled with a detectable substance. 

Antisense Oligonucleotides and Mimetics 

The invention is further directed to antisense oligonucleotides and mimetics to these 
polynucleotide sequences. Antisense technology can be used to control gene expression through 
triple-helix formation or antisense DNA or RNA, both of which methods are based on binding of a 
polynucleotide to DNA or RNA. A DNA oligonucleotide is designed to be complementary to a region 
of the gene involved in transcription or RNA processing (triple helix; see Lee et al., Nucl. Acids Res.,
6:3073 (1979); Cooney et al., Science, 241:456 (1988); and Dervan et al., Science, 251:1360 (1991)),
thereby preventing transcription and the production of said polypeptides. 
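
The complementarity requirement described above can be sketched in Python as a reverse-complement computation. This is only a minimal illustration under stated assumptions: the function name and the target sequence are hypothetical, and real antisense design would also weigh target accessibility, specificity and chemical modification.

```python
# Watson-Crick pairing for DNA bases.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_dna(target):
    """Return the reverse complement of a DNA target, read 5' to 3'.

    `target` is the sense-strand region the oligonucleotide should bind;
    the name and signature are illustrative assumptions.
    """
    target = target.upper()
    try:
        return "".join(_COMPLEMENT[base] for base in reversed(target))
    except KeyError as exc:
        raise ValueError(f"unexpected base: {exc.args[0]!r}") from None


# Hypothetical 15-nt target region, not taken from any SEQ ID NO.
oligo = antisense_dna("ATGGCCTTAGGCATT")
```

Applying the function twice returns the original target, which is a convenient sanity check on the pairing table.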

The antisense oligonucleotides or mimetics of the present invention may be used to decrease 
levels of a polypeptide. For example, SNARE YKT6 has been found to be essential for vesicle- 
associated endoplasmic reticulum-Golgi transport and cell growth. Therefore, the SNARE YKT6 
antisense oligonucleotides of the present invention could be used to inhibit cell growth and in 
particular, to treat or prevent tumor growth. POLD2 is necessary for DNA replication. POLD2 
antisense sequences could also be used to inhibit cell growth. Glucokinase and AEBP1 antisense 
sequences may be used to treat hyperglycemia. 

The antisense oligonucleotides of the present invention may be formulated into pharmaceutical 
compositions. These compositions may be administered in a number of ways depending upon whether 
local or systemic treatment is desired and upon the area to be treated. Administration may be topical
(including ophthalmic and to mucous membranes including vaginal and rectal delivery), pulmonary
(e.g., by inhalation or insufflation of powders or aerosols, including by nebulizer; intratracheal,
intranasal, epidermal and transdermal), oral or parenteral. Parenteral administration includes 
intravenous, intraarterial, subcutaneous, intraperitoneal or intramuscular injection or infusion; or 
intracranial, e.g., intrathecal or intraventricular, administration. 

Pharmaceutical compositions and formulations for topical administration may include 
transdermal patches, ointments, lotions, creams, gels, drops, suppositories, sprays, liquids and powders. 
Conventional pharmaceutical carriers, aqueous, powder or oily bases, thickeners and the like may be 
necessary or desirable. 

Compositions and formulations for oral administration include powders or granules, 
suspensions or solutions in water or non-aqueous media, capsules, sachets or tablets. Thickeners, 
flavoring agents, diluents, emulsifiers, dispersing aids or binders may be desirable.

Compositions and formulations for parenteral, intrathecal or intraventricular administration 
may include sterile aqueous solutions which may also contain buffers, diluents and other suitable 
additives such as, but not limited to, penetration enhancers, carrier compounds and other 
pharmaceutically acceptable carriers or excipients.

Pharmaceutical compositions of the present invention include, but are not limited to, solutions, 
emulsions, and liposome-containing formulations. These compositions may be generated from a 
variety of components that include, but are not limited to, preformed liquids, self-emulsifying solids 
and self-emulsifying semisolids. 

The pharmaceutical formulations of the present invention, which may conveniently be 
presented in unit dosage form, may be prepared according to conventional techniques well known in 
the pharmaceutical industry. Such techniques include the step of bringing into association the active 
ingredients with the pharmaceutical carrier(s) or excipient(s). In general, the formulations are prepared 
by uniformly and intimately bringing into association the active ingredients with liquid carriers or 
finely divided solid carriers or both, and then, if necessary, shaping the product. 

The compositions of the present invention may be formulated into any of many possible 
dosage forms such as, but not limited to, tablets, capsules, liquid syrups, soft gels, suppositories, and 
enemas. The compositions of the present invention may also be formulated as suspensions in aqueous, 
non-aqueous or mixed media. Aqueous suspensions may further contain substances which increase the 
viscosity of the suspension including, for example, sodium carboxymethylcellulose, sorbitol and/or 
dextran. The suspension may also contain stabilizers. 

In one embodiment of the present invention, the pharmaceutical compositions may be 
formulated and used as foams. Pharmaceutical foams include formulations such as, but not limited to, 
emulsions, microemulsions, creams, jellies and liposomes. While basically similar in nature these
formulations vary in the components and the consistency of the final product. The preparation of such
compositions and formulations is generally known to those skilled in the pharmaceutical and 
formulation arts and may be applied to the formulation of the compositions of the present invention. 

The formulation of therapeutic compositions and their subsequent administration is believed to 
be within the skill of those in the art. Dosing is dependent on severity and responsiveness of the 
disease state to be treated, with the course of treatment lasting from several days to several months, or 
until a cure is effected or a diminution of the disease state is achieved. Optimal dosing schedules can be 
calculated from measurements of drug accumulation in the body of the patient. Persons of ordinary 
skill can easily determine optimum dosages, dosing methodologies and repetition rates. Optimum 
dosages may vary depending on the relative potency of individual oligonucleotides, and can generally 
be estimated based on EC50 values found to be effective in in vitro and in vivo animal models.

In general, dosage is from 0.01 µg to 10 g per kg of body weight, and may be given once or
more daily, weekly, monthly or yearly, or even once every 2 to 20 years. Persons of ordinary skill in 
the art can easily estimate repetition rates for dosing based on measured residence times and 
concentrations of the drug in bodily fluids or tissues. Following successful treatment, it may be 
desirable to have the patient undergo maintenance therapy to prevent the recurrence of the disease 
state, wherein the oligonucleotide is administered in maintenance doses, ranging from 0.01 µg to 10 g
per kg of body weight, once or more daily, to once every 20 years. 
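
The per-kilogram range stated above translates into absolute amounts by simple multiplication. The Python sketch below shows that arithmetic only; the function name and the 70 kg example weight are assumptions, and the result says nothing about what dose would be appropriate for any subject.

```python
MIN_DOSE_G_PER_KG = 0.01e-6  # 0.01 microgram per kg, expressed in grams
MAX_DOSE_G_PER_KG = 10.0     # 10 g per kg


def dose_bounds_g(body_weight_kg):
    """Return (lower, upper) bounds in grams for the stated per-kg range.

    The function name is illustrative; only the two per-kg limits come
    from the text.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (MIN_DOSE_G_PER_KG * body_weight_kg,
            MAX_DOSE_G_PER_KG * body_weight_kg)


# Hypothetical 70 kg subject: roughly 0.7 microgram up to 700 g.
lower_g, upper_g = dose_bounds_g(70)
```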

Gene Therapy 

As noted above, SNARE YKT6 is necessary for cell growth, POLD2 is involved in DNA 
replication and repair, AEBP1 is involved in repressing adipogenesis and glucokinase is involved in 
glucose sensing in pancreatic islet beta cells and liver. Therefore, the SNARE YKT6 gene may be used 
to modulate or prevent cell apoptosis and treat such disorders as virus-induced lymphocyte depletion 
(AIDS); cell death in neurodegenerative disorders characterized by the gradual loss of specific sets of 
neurons (e.g., Alzheimer's Disease, Parkinson's disease, ALS, retinitis pigmentosa, spinal muscular 
atrophy and various forms of cerebellar degeneration), cell death in blood cell disorders resulting from 
deprivation of growth factors (anemia associated with chronic disease, aplastic anemia, chronic 
neutropenia and myelodysplastic syndromes) and disorders arising out of an acute loss of blood flow 
(e.g., myocardial infarctions and stroke). The glucokinase gene may be used to treat diabetes mellitus. 
The AEBP1 gene may be used to modulate or inhibit adipogenesis and treat obesity, diabetes mellitus 
and/or osteopenic disorders. POLD2 may be used to treat defects in DNA repair such as xeroderma 
pigmentosum, progeria and ataxia telangiectasia. 

As described herein, the polynucleotide of the present invention may be introduced into a 
patient's cells for therapeutic uses. As will be discussed in further detail below, cells can be transfected 
using any appropriate means, including viral vectors, as shown by the example, chemical transfectants, 
or physico-mechanical methods such as electroporation and direct diffusion of DNA. See, for example,
Wolff, Jon A, et al., "Direct gene transfer into mouse muscle in vivo," Science, 247, 1465-1468, 1990;
and Wolff, Jon A, "Human dystrophin expression in mdx mice after intramuscular injection of DNA 
constructs," Nature, 352, 815-818, 1991. As used herein, vectors are agents that transport the gene into 
the cell without degradation and include a promoter yielding expression of the gene in the cells into 
which it is delivered. As will be discussed in further detail below, promoters can be general promoters, 
yielding expression in a variety of mammalian cells, or cell specific, or even nuclear versus cytoplasmic 
specific. These are known to those skilled in the art and can be constructed using standard molecular 
biology protocols. Vectors have been divided into two classes: 

a) Biological agents derived from viral, bacterial or other sources. 

b) Chemical/physical methods that increase the potential for gene uptake, directly introduce the
gene into the nucleus or target the gene to a cell receptor. 

Biological Vectors 

Viral vectors have higher transfection efficiency (ability to introduce genes) than do most
chemical or physical methods used to introduce genes into cells. Vectors that may be used in the present
invention include viruses, such as adenoviruses, adeno associated virus (AAV), vaccinia, herpesviruses, 
baculoviruses and retroviruses, bacteriophages, cosmids, plasmids, fungal vectors and other 
recombination vehicles typically used in the art which have been described for expression in a variety 
of eukaryotic and prokaryotic hosts, and may be used for gene therapy as well as for simple protein 
expression. Polynucleotides are inserted into vector genomes using methods well known in the art. 

Retroviral vectors are the vectors most commonly used in clinical trials, since they carry a 
larger genetic payload than other viral vectors. However, they are not useful in non-proliferating cells. 
Adenovirus vectors are relatively stable and easy to work with, have high titers, and can be delivered in 
aerosol formulation. Pox viral vectors are large and have several sites for inserting genes; they are
thermostable and can be stored at room temperature. 

Examples of promoters are SP6, T4, T7, SV40 early promoter, cytomegalovirus (CMV) 
promoter, mouse mammary tumor virus (MMTV) steroid-inducible promoter, Moloney murine 
leukemia virus (MMLV) promoter, phosphoglycerate kinase (PGK) promoter, and the like. 
Alternatively, the promoter may be an endogenous adenovirus promoter, for example the E1a
promoter or the Ad2 major late promoter (MLP). Similarly, those of ordinary skill in the art can 
construct adenoviral vectors utilizing endogenous or heterologous poly A addition signals. 

Plasmids are not integrated into the genome and the vast majority of them are present only 
from a few weeks to several months, so they are typically very safe. However, they have lower 
expression levels than retroviruses and since cells have the ability to identify and eventually shut down 
foreign gene expression, the continuous release of DNA from the polymer to the target cells
substantially increases the duration of functional expression while maintaining the benefit of the safety
associated with non-viral transfections. 

Chemical/physical vectors 

Other methods to directly introduce genes into cells or exploit receptors on the surface of cells 
include the use of liposomes and lipids, ligands for specific cell surface receptors, cell receptors, and 
calcium phosphate and other chemical mediators, microinjections directly to single cells, 
electroporation and homologous recombination. Liposomes are commercially available from Gibco 
BRL, for example, as LIPOFECTIN and LIPOFECTACE, which are formed of cationic lipids such as
N-[1-(2,3-dioleyloxy)-propyl]-N,N,N-trimethylammonium chloride (DOTMA) and dimethyl
dioctadecylammonium bromide (DDAB). Numerous methods are also published for making 
liposomes, known to those skilled in the art. 

For example, in nucleic acid-lipid complexes, lipid carriers can be associated with naked
nucleic acids (e.g., plasmid DNA) to facilitate passage through cellular membranes. Cationic, anionic, 
or neutral lipids can be used for this purpose. However, cationic lipids are preferred because they have 
been shown to associate better with DNA which, generally, has a negative charge. Cationic lipids have 
also been shown to mediate intracellular delivery of plasmid DNA (Felgner and Ringold, Nature
337:387 (1989)). Intravenous injection of cationic lipid-plasmid complexes into mice has been shown
to result in expression of the DNA in lung (Brigham et al., Am. J. Med. Sci. 298:278 (1989)). See also,
Osaka et al., J. Pharm. Sci. 85(6):612-618 (1996); San et al., Human Gene Therapy 4:781-788 (1993);
Senior et al., Biochimica et Biophysica Acta 1070:173-179 (1991); Kabanov and Kabanov,
Bioconjugate Chem. 6:7-20 (1995); Remy et al., Bioconjugate Chem. 5:647-654 (1994); Behr, J-P., 
Bioconjugate Chem 5:382-389 (1994); Behr et al., Proc. Natl. Acad. Sci., USA 86:6982-6986 (1989); 
and Wyman et al., Biochem. 36:3008-3017 (1997). 

Cationic lipids are known to those of ordinary skill in the art. Representative cationic lipids 
include those disclosed, for example, in U.S. Pat. No. 5,283,185; and e.g., U.S. Pat. No. 5,767,099. In a 
preferred embodiment, the cationic lipid is N4-spermine cholesteryl carbamate (GL-67) disclosed in
U.S. Pat. No. 5,767,099. Additional preferred lipids include N4-spermidine cholesteryl carbamate
(GL-53) and 1-(N4-spermine)-2,3-dilaurylglycerol carbamate (GL-89).

The vectors of the invention may be targeted to specific cells by linking a targeting molecule to 
the vector. A targeting molecule is any agent that is specific for a cell or tissue type of interest, 
including for example, a ligand, antibody, sugar, receptor, or other binding molecule. 

Invention vectors may be delivered to the target cells in a suitable composition, either alone, or 
complexed, as provided above, comprising the vector and a suitably acceptable carrier. The vector may
be delivered to target cells by methods known in the art, for example, intravenous, intramuscular,
intranasal, subcutaneous, intubation, lavage, and the like. The vectors may be delivered via in vivo or ex 
vivo applications. In vivo applications involve the direct administration of an adenoviral vector of the 
invention formulated into a composition to the cells of an individual. Ex vivo applications involve the 
transfer of the adenoviral vector directly to harvested autologous cells which are maintained in vitro, 
followed by readministration of the transduced cells to a recipient. 

In a specific embodiment, the vector is transfected into antigen-presenting cells. Suitable 
sources of antigen-presenting cells (APCs) include, but are not limited to, whole cells such as dendritic 
cells or macrophages; purified MHC class I molecule complexed to β2-microglobulin and foster
antigen-presenting cells. In a specific embodiment, the vectors of the present invention may be 
introduced into T cells or B cells using methods known in the art (see, for example, Tsokos and 
Nepom, 2000, J. Clin. Invest. 106:181-183). 

The invention described and claimed herein is not to be limited in scope by the specific
embodiments herein disclosed, since these embodiments are intended as illustrations of several aspects 
of the invention. Any equivalent embodiments are intended to be within the scope of this invention. 
Indeed, various modifications of the invention in addition to those shown and described herein will 
become apparent to those skilled in the art from the foregoing description. Such modifications are 
also intended to fall within the scope of the appended claims. 

Various references are cited herein, the disclosures of which are incorporated by reference in
their entireties.

WHAT IS CLAIMED IS:

1. (currently amended) An isolated genomic nucleic acid molecule,
said nucleic acid molecule obtainable from human chromosome 7 having a
nucleotide sequence at least 95% identical to a sequence selected from the group consisting
of:

(a) a nucleic acid molecule encoding a polypeptide selected from
the group consisting of human SNARE YKT6 depicted in SEQ ID NO:1, human
glucokinase depicted in SEQ ID NO:2, human adipocyte enhancer binding protein 1 depicted
in SEQ ID NO:3 and DNA directed 50kD regulatory subunit (POLD2) depicted in SEQ ID
NO:4 and variants thereof;

(b) a nucleic acid molecule selected from the group consisting of SEQ
ID NO:5 which encodes human SNARE YKT6 depicted in SEQ ID NO:1, SEQ ID NO:6
which encodes human glucokinase depicted in SEQ ID NO:2, SEQ ID NO:8 which
encodes human adipocyte enhancer binding protein 1 depicted in SEQ ID NO:3 and SEQ ID
NO:7 which encodes DNA directed 50kD regulatory subunit (POLD2) depicted in SEQ ID
NO:4 and variants thereof;

(c) a nucleic acid molecule extending from the 5'-end of SEQ ID NO:5 to the 3'-end
of SEQ ID NO:8 that comprises the contiguous coding sequences for SNARE YKT6,
glucokinase, POLD2 and the adipocyte enhancer binding protein 1;

(d) a nucleic acid molecule which hybridizes to any one of the
polynucleotides specified in (a)-(c);

(e) a nucleic acid molecule which is a reverse complement of the
polynucleotides specified in (a)-(c).

2. (currently amended) A nucleic acid construct comprising the
nucleic acid molecule of claim 1.

3. (currently amended) An expression vector comprising the nucleic
acid molecule of claim 1.

4. (original) A recombinant host cell comprising the nucleic acid construct
of claim 2.







Claim 5 (cancelled) 

6. (currently amended) A method for obtaining a polypeptide encoded by a
nucleic acid molecule obtainable from human chromosome 7, said
polypeptide selected from the group consisting of human SNARE YKT6, human
glucokinase, human adipocyte enhancer binding protein 1 and DNA directed 50kD
regulatory subunit (POLD2) comprising:

(a) culturing the recombinant host cell of claim 4 under conditions that provide for
the expression of said polypeptide and

(b) recovering said expressed polypeptide. 

7. (currently amended) A method for preparing an antibody specific to a polypeptide
selected from the group consisting of human SNARE YKT6, human glucokinase, human
adipocyte enhancer binding protein 1 and DNA directed 50kD regulatory subunit (POLD2)
comprising: 

(a) obtaining a polypeptide according to the method of claim 6; 

(b) optionally conjugating said polypeptide to a carrier protein; 

(c) immunizing a host animal with said polypeptide or polypeptide-carrier protein 
conjugate of step (b) with an adjuvant and 

(d) obtaining antibody from said immunized host animal. 

8. (currently amended) An isolated nucleic acid molecule of at least 15 nucleotides which
hybridizes at high stringency to a non-coding region specific to
the nucleic acid molecule of claim 1, which non-coding region is selected from the group
consisting of an intron, a splice junction, a 5' non-coding region, a transcription factor
binding region, an expression control region and a 3' non-coding region.

9. (currently amended) A method of diagnosing a pathological condition or
susceptibility to a pathological condition in a subject comprising:

(a) isolating genomic DNA from a subject;

(b) determining the presence or absence of a variant in said genomic DNA
using the nucleic acid molecule of claim 8 and

(c) diagnosing a pathological condition or a susceptibility to a pathological
condition based on the presence or absence of said variant.

10. (currently amended) A composition comprising the nucleic acid molecule
of claim 1 and a carrier. 

11. (currently amended) A composition comprising the
nucleic acid molecule of claim 8 and a carrier.

12. (original) A method for preventing, treating or ameliorating a medical condition, 
comprising administering to a subject an amount of the composition of claim 10 effective to 
prevent, treat or ameliorate said medical condition. 

13. (original) A method for preventing, treating or ameliorating a medical
condition, comprising administering to a subject an amount of the composition of claim 11
effective to prevent, treat or ameliorate said medical condition. 

14. (currently amended) A kit comprising the nucleic acid molecule of
claim 8.

15. (original) The kit according to claim 14, in which the polynucleotide is labeled 
with a detectable substance. 

16. (currently amended) The kit according to claim 14, which comprises a plurality of
nucleic acid molecules.

Claims 17-22 are cancelled.

23. (new) A method for modulating levels of human SNARE YKT6, human glucokinase,
human adipocyte enhancer binding protein 1 or DNA directed 50kD regulatory subunit 
(POLD2) in a subject in need thereof comprising administering to said subject an amount of 
the nucleic acid molecule of claim 1 effective to modulate said human SNARE YKT6, human 
glucokinase, human adipocyte enhancer binding protein 1 or DNA directed 50kD regulatory 
subunit (POLD2) levels. 

24. (new) A method for modulating levels of human SNARE YKT6, human glucokinase, 
human adipocyte enhancer binding protein 1 or DNA directed 50kD regulatory subunit 
(POLD2) in a subject in need thereof comprising administering to said subject an amount of 
the nucleic acid molecule of claim 8 effective to modulate said human SNARE YKT6, human 
glucokinase, human adipocyte enhancer binding protein 1 or DNA directed 50kD regulatory 
subunit (POLD2) levels. 

25. (new) A method of identifying variants of SEQ ID NOS: 5, 6, 7 or 8 comprising 

(a) isolating genomic DNA from a subject and 

(b) determining the presence or absence of a variant in said genomic DNA using the nucleic 
acid molecule of claim 8. 

26. (new) A method for detecting the presence or absence of a non-coding nucleic acid 
sequence specific to the nucleic acid molecule of claim 1 in a sample, said method 
comprising contacting the sample with a nucleic acid molecule of at least 15 nucleotides 
which hybridizes at high stringency to a non-coding region specific to the nucleic acid 
molecule of claim 1, which non-coding region is selected from the group consisting of an 
intron, a splice junction, a 5' non-coding region, a transcription factor binding region, an 
expression control region and a 3' non-coding region.

ABSTRACT

The invention is directed to isolated genomic polynucleotide fragments that encode human
SNARE YKT6, human glucokinase, human adipocyte enhancer binding protein 1 (AEBP1) and
DNA directed 50kD regulatory subunit (POLD2), vectors and hosts containing these fragments and
fragments hybridizing to noncoding regions as well as antisense oligonucleotides to these fragments.
The invention is further directed to methods of using these fragments to obtain SNARE YKT6, human
glucokinase, AEBP1 protein and POLD2 and to diagnose, treat, prevent and/or ameliorate a
pathological disorder.